notepad++ regular expression find and count or copy

I want to grab all occurrences in a configuration file where the first line starts with 'object' and the line immediately after it starts with 'nat':
object network obj_any
nat (inside,outside) dynamic interface
object network obj-test
nat (DMZ1,outside) static 10.206.49.180
object network obj-192.168.236.200
nat (DMZ1,outside) static 10.206.74.60
object network obj-192.168.236.8
nat (DMZ1,outside) static 10.206.49.183 tcp 8080 80
object network obj-192.168.236.9
nat (DMZ1,outside) static 10.206.49.178 tcp 1002 22
object network obj-192.168.236.10
nat (DMZ1,outside) static 10.206.49.178 tcp 8080 80
object network obj-192.168.236.13
nat (DMZ1,outside) static 10.206.74.58 dns
I tried the pattern below, but it does not seem to work:
object network .+? nat .+? static .+?
I also selected 'match new line', but it still does not match.

I believe that this cannot be done in one step with Notepad++. A multi-step process to copy the lines is as follows.
(1) Find the wanted pairs of lines and merge them into single lines with a marker string. (2) Bookmark all lines with the marker and copy them. (3) Paste the wanted lines into a new buffer and convert the marker strings back to newlines.
In more detail.
(Setup) Choose a marker string, something that does not occur anywhere in the buffer being searched or in the destination buffer. For this example I choose !!!.
(1) Do a regular expression replace of ^(object.*)\R+( nat.*)$ with \1!!!\2. This converts the wanted lines so that the first pair shown in the question becomes:
object network obj_any!!! nat (inside,outside) dynamic interface
(2) Open the search window and select the Mark tab. Click on Clear all marks, tick Bookmark line, enter the marker string (i.e. !!!) into the Find what field and click on Mark all. Then select menu => Search => Bookmark => Copy bookmarked lines.
(3) Select the place where the copied lines should be written and Paste in the copied lines. Do a regular expression search and replace of !!! with \r\n\r\n. (You may need to alter the replacement string if your preferred line endings are not Windows.)
Notes
The above does not preserve the exact sequence of CRs and LFs between the two lines. The first replace uses \R+ to find any combination of CRs and LFs between the two lines. The final replace inserts a fixed CR and LF sequence.
Rather than using Copy bookmarked lines, it may be suitable to use Remove unmarked lines, which leaves only the wanted lines in the buffer. The Copy and Paste commands are then not needed, and the final search and replace can be done in the initial buffer.
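If leaving Notepad++ is an option, the same pairing can be sketched in a few lines of Python. This is only an illustration and not part of the Notepad++ procedure above: the name find_pairs and the sample text are made up for the example, and it assumes the nat line may be indented with spaces or tabs.

```python
import re

# An "object" line immediately followed by a line starting with "nat".
# [^\r\n]* keeps each group on one line for both Windows and Unix endings.
PAIR = re.compile(r"^(object[^\r\n]*)\r?\n[ \t]*(nat[^\r\n]*)", re.MULTILINE)


def find_pairs(text):
    """Return (object_line, nat_line) tuples for each adjacent pair."""
    return [(m.group(1), m.group(2)) for m in PAIR.finditer(text)]


# Hypothetical sample: the middle object has no nat line and is skipped.
sample = (
    "object network obj_any\n"
    " nat (inside,outside) dynamic interface\n"
    "object network lonely\n"
    "object network obj-test\n"
    " nat (DMZ1,outside) static 10.206.49.180\n"
)

pairs = find_pairs(sample)
print(len(pairs))  # count of matching pairs
for obj, nat in pairs:
    print(obj)
    print(nat)
```

The count comes from the length of the returned list, which covers the "find and count" half of the question; printing the tuples covers the "copy" half.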

Related

Select multiple variables with Regex inside a single string

Regex101 link
https://regex101.com/r/wOwFEV/2
Background
I have a dump of nmap reports that I want to extract data from and digest.
I have various inputs similar to:
23/tcp open telnet SMC SMC2870W Wireless Ethernet Bridge
The latter three variables change, but the common denominator is:
The first value is ALWAYS 23/tcp
They are ALWAYS separated by more than one space
There will ALWAYS be four values
I would like to use Regex to pluck each "variable" and assign it to a group.
Right now, I have
(?sm)(?=^23\/tcp)(?<port>.*?)\s*open
Which grabs 23/tcp and assigns it to <port>
But I also want to grab:
open and assign it to <state>
telnet and assign it to <service>
SMC SMC2870W Wireless Ethernet Bridge and assign it to <description>
If not an answer, I think knowing how to grab values between '2 or more' white spaces will solve this, but I can't find any similar examples!

A more specific regexp is:
(?sm)(?=^23\/tcp)(?<port>\d+\/\w+)\s+(?<state>\w*?)\s+(?<service>\w*?)\s+(?<description>.*?)\s$
This restricts the port to be digits/alphanumeric, and state and service to be alphanumeric. It only uses .* for the description, since it's arbitrary text.
And with this change, it's not necessary to require that there be at least 2 spaces between each field, it will work with any number of spaces.

Nevermind, got it.
(?sm)(?=^23\/tcp)(?<port>.*?)\s{2,}(?<state>.*?)\s{2,}(?<service>.*?)\s{2,}(?<description>.*?)$
Will do exactly what I described.
https://regex101.com/r/wOwFEV/3
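For anyone processing the reports in a script, here is a minimal Python sketch of the same "two or more spaces" idea. It is an assumption-laden illustration: Python spells named groups (?P<name>...) rather than (?<name>...), the name parse_line is invented for the example, and the sample line assumes the fields are separated by several spaces as described in the question.

```python
import re

# Four fields separated by runs of two or more whitespace characters.
# The description may contain single spaces, so it is matched last.
LINE = re.compile(
    r"^(?P<port>\d+/\w+)\s{2,}"
    r"(?P<state>\S+)\s{2,}"
    r"(?P<service>\S+)\s{2,}"
    r"(?P<description>.*\S)\s*$"
)


def parse_line(line):
    """Return a dict of the four fields, or None if the line does not match."""
    match = LINE.match(line)
    return match.groupdict() if match else None


# Hypothetical input with three spaces between fields.
sample = "23/tcp   open   telnet   SMC SMC2870W Wireless Ethernet Bridge"
fields = parse_line(sample)
print(fields)
```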

Searching for an unknown IP using FINDSTR

I have text files with hundreds of entries like those below. They mostly come in pairs of 2 IPs. Sometimes they come as 3 IPs. I am trying to find that third IP that is always in the middle of the stack (syntax below). There are maximum 3 different IPs in each file at all times. It is possible that some text files won’t have that middle IP (its occurrence is quite rare). How do I write the search command to find the middle IP from mentioned stacks if there is one in the text file? OS: Win7.
Text file sample syntax:
- saving IP addresses
* 192.168.1.1
* 111.111.222.222
- over
- saving IP addresses
* 192.168.1.1
* 11.123.11.123
* 111.111.222.222
- over
- saving IP addresses
* 192.168.1.1
* 111.111.222.222
- over
I have tried findstr \-.*\*.*\*.*\- pathtofile.txt, which should return the block of 3 IPs if there is such a block in the file, but it didn't work.

Assuming your real file isn't double-spaced like your sample, the following will output the first line (saving...) and line number of matching blocks. Your real problem is that findstr will only output one line even if you are matching across lines, so you will never get the whole block as output. You need a better tool.
Note: I am using the JPSoft Take Command escape character to put in CR and LF, but you can create them in real batch files as well, though it isn't easy.
findstr /n /R saving.*^r^n.*\..*\..*\..*^r^n.*\..*\..*\..*^r^n.*\..*\..*\..*^r^n sampleIPinput.txt
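As one example of a "better tool", the block structure can be walked line by line in a short script. The following Python sketch is an assumption, not something findstr can do: the name middle_ips is hypothetical, and it relies on the "- saving", "*" and "- over" markers shown in the question.

```python
def middle_ips(lines):
    """Return the middle IP of every block that lists exactly three IPs."""
    found = []
    block = None  # None means we are outside a saving/over block
    for raw in lines:
        line = raw.strip()
        if line.startswith("- saving"):
            block = []
        elif line.startswith("- over"):
            if block is not None and len(block) == 3:
                found.append(block[1])
            block = None
        elif line.startswith("*") and block is not None:
            block.append(line.lstrip("* ").strip())
    return found


# Sample taken from the question.
sample = """- saving IP addresses
* 192.168.1.1
* 111.111.222.222
- over
- saving IP addresses
* 192.168.1.1
* 11.123.11.123
* 111.111.222.222
- over
""".splitlines()

print(middle_ips(sample))
```

Blank lines between entries are ignored, so the same function should also cope with a double-spaced file.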

How to safely bypass tabs in RegEx

I'm using C to do my regular expressions. Things work except for when the input string contains tabs.
This is my RegEx I plug into the regcomp function:
(DROP).*(tcp).*([\\.0-9]+).*0\\.0\\.0\\.0.*dpt:([0-9]+)(.*)
Regcomp returned OK with no issues.
I then used the following string to do the matching with:
DROP\ttcp\t--\t202.153.39.52\t0.0.0.0/0\ttcp dpt:21
I'm using such string to simulate output of iptables because I want to make a program to see which IPs are already listed.
When I execute my program, I receive the following pieces of output after executing the RegEx where the first line is data from the first offset:
DROP tcp -- 202.153.39.52 0.0.0.0/0 tcp dpt:21
DROP
tcp
2
21
Everything is correct except the second-last value. It shows 2, but I expect it to be 202.153.39.52. I used ([\\.0-9]+) in my RegEx to try to specifically state that I only want numbers and dots to match.
How do I fix my RegEx?
UPDATE
I then proceeded to use this RegEx instead in hopes I get each individual octet of the IP address
(DROP).*(tcp).*([0-9]+)\\.([0-9]+)\\.([0-9]+)\\.([0-9]+).*(0\\.0\\.0\\.0).*dpt:([0-9]+)
This is my result:
DROP tcp -- 202.153.39.52 0.0.0.0/0 tcp dpt:21
DROP
tcp
2
153
39
52
0.0.0.0
21
Now this means the first ([0-9]+) isn't processing properly. I should receive a 202, not a 2. Is there something I'm doing wrong? Do I need a special flag for any RegEx function?

I think you're confused about the difference between regex syntax and that syntax encoded as a string (in languages like Java that don't have first class regexes).
Try something more robust and commonsense:
DROP\s+tcp\s+\S+\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+0\.0\.0\.0/0\s+tcp\s+dpt:(\d+)
This will capture the ip address and the port number only. Why would you want to capture a fixed string like DROP?
As a string, this is:
"DROP\\s+tcp\\s+\\S+\\s+(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\s+0\\.0\\.0\\.0/0\\s+tcp\\s+dpt:(\\d+)"
Use an online regex tester like this one for testing and to convert from regex to string automatically.
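The reason the original pattern captures only 2 is that the .* in front of the group is greedy: it swallows as much of the address as it can and leaves a single character for ([.0-9]+). The sketch below reproduces this with Python's re module, which is used here only for illustration. Note, as an assumption to check against your own platform, that POSIX regcomp does not define \d or \s, so in C the bracket forms [0-9] and [[:space:]] may be needed instead.

```python
import re

# Same simulated iptables line as in the question (tab separated).
line = "DROP\ttcp\t--\t202.153.39.52\t0.0.0.0/0\ttcp dpt:21"

# Original pattern: the greedy .* before the group eats most of the IP.
greedy = re.compile(r"(DROP).*(tcp).*([.0-9]+).*0\.0\.0\.0.*dpt:([0-9]+)(.*)")

# Stricter pattern: explicit whitespace between fields, no free-running .*
strict = re.compile(
    r"DROP\s+tcp\s+\S+\s+(\d{1,3}(?:\.\d{1,3}){3})\s+0\.0\.0\.0/0\s+tcp\s+dpt:(\d+)"
)

greedy_ip = greedy.search(line).group(3)
strict_ip, strict_port = strict.search(line).groups()

print(greedy_ip)               # only the last digit of the address
print(strict_ip, strict_port)  # full address and port
```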

Notepad++ - Selecting or Highlighting multiple sections of repeated text IN 1 LINE

I have a text file in Notepad++ that contains about 66,000 words all in 1 line, and it is a set of 200 "lines" of output that are all unique and placed in 1 line in the basic JSON form {output:[{output1},{output2},...}]}.
There is a set of characters matching the RegEx expression "id":.........,"kind":"track" that occurs about 285 times in total, and I am trying to either single them out, or copy all of them at once.
Basically, without some super complicated RegEx terms, I am stuck because I can't figure out how to highlight all of them at once, and also the Remove Unbookmarked Lines feature does not apply because this is all in one line. I have only managed to be able to Mark every single occurrence.
So does this require a large number of steps to get the file into multiple lines and work from there, or is there something else I am missing?
Edit: I have come up with a set of Macro schemes that make the process of doing this manually work much faster. It's another alternative but still takes a few steps and quite some time.
Edit 2: I intended there to be an answer for actually just highlighting the different sections all at once, but I guess that is not possible. The answer here turns out to be more useful in my case, allowing me to have a list of IDs without everything else.

You seem to already have a regex which matches single instances of your pattern, so assuming it works and that we must use Notepad++ for this:
Replace .*?("id":.........,"kind":"track").*?(?="id":.........,"kind":"track"|$) with \1.
If this textfile is valid JSON, this opens you up to other, non-notepad++ options, like using Python with the json module.
Edited to remove unnecessary steps
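A minimal sketch of the non-Notepad++ route, using the same pattern with Python's re module rather than the json module, since the exact JSON layout is not shown in the question. The name track_ids and the sample string are hypothetical, and the nine-character ID width is taken from the nine dots in the pattern above.

```python
import re

# "id": followed by exactly nine characters, then ,"kind":"track"
TRACK_ID = re.compile(r'"id":(.{9}),"kind":"track"')


def track_ids(text):
    """Return every ID that is immediately followed by "kind":"track"."""
    return TRACK_ID.findall(text)


# Made-up single-line input in the general shape described in the question.
sample = (
    '{"output":[{"id":123456789,"kind":"track","title":"a"},'
    '{"id":987654321,"kind":"track"},'
    '{"id":555555555,"kind":"user"}]}'
)

for track_id in track_ids(sample):
    print(track_id)
```

Because findall returns a list, the IDs come out one per item regardless of the input being a single line.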

Compare files and return only the differences using Notepad++

Notepad++ has a Compare Plugin tool for comparing text files, which operates like this:
Launch Notepad++ and open the two files you wish to run a comparison check on.
Click the “Plugins” menu,
Select “Compare” and click “Compare.”
The plugin will run a comparison check and display the two files side by side, with any differences in the text highlighted.
This is a nice feature, which I have used happily for some time. Now I have been looking for an option to go further and select the highlighted differing lines (e.g. by deleting the non-highlighted ones), or vice versa, i.e. expunge the highlighted lines.
Is there a straightforward way to achieve this?

To subtract two files in Notepad++ (file1 - file2) you may follow this procedure:
Recommended: If possible, remove duplicates in both files, especially if the files are big. To do this: Edit => Line operations => Sort Lines Lexicographically Ascending (do it on both files)
Add ---------------------------- as a footer on file1 (add at least 10 dashes). This is the marker line that separates file1 content from file2.
Then copy the contents of file2 to the end of file1 (after the marker)
Control + H
Search: (?m-s)^(?:-{10,}+\R[\s\S]*+|(.*+)\R(?=(?:(?!^-{10,}$)-++|[^-]*+)*+^-{10,}+\R(?:^.*+\R)*?\1(?:\R|\z)))
(Note: set case sensitivity according to your needs.)
Replace by: (leave empty)
Select Regular expression radio button
Replace All
You can modify the marker if it is possible that file1/file2 contain lines equal to the marker. In that case you will have to adapt the regular expression.
By the way, you could even record a macro to do all the steps (add the marker, switch to file2, copy its content to file1, apply the regex) with a single button press.
Edited:
Changed the regex to add some improvements:
Speed related:
Avoid as much backtracking as possible
Avoid searching after the mark
Usability:
Dashes are allowed for the lines. But the separator is still ^-{10,}$
Works with other characters besides words
Speed comparison:
New method vs Old method
So basically 78ms vs 1.6seconds. So a nice improvement! That makes comparing Kilobyte-sized files possible.
Still, you may want to use a dedicated program for comparing or subtracting bigger files.
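As one possible "dedicated program", the subtraction itself is a few lines in a script. This Python sketch is only an illustration under the assumption that whole lines are compared exactly (case-sensitive); the name subtract_lines and the sample lists are made up.

```python
def subtract_lines(file1_lines, file2_lines):
    """Return the lines of file1 that do not occur anywhere in file2.

    The order of file1 is kept; a set makes each lookup constant time,
    so large files do not need the regex backtracking described above.
    """
    exclude = set(file2_lines)
    return [line for line in file1_lines if line not in exclude]


# Hypothetical contents of the two files, already split into lines.
file1 = ["alpha", "beta", "gamma", "delta"]
file2 = ["beta", "delta", "epsilon"]

print(subtract_lines(file1, file2))
```

For real files, the lists could come from reading each file and calling splitlines() on its contents.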

If the number of differences is not large, a quicker method might be just bookmarking each differing line using keyboard shortcuts. Starting from the beginning of the file, press Alt+Page Down to focus on the first difference, and then press Ctrl+F2 to bookmark it. Continue alternately pressing Alt+Page Down and Ctrl+F2 until the last difference.
With all the differing lines bookmarked, you can use any of the operations under "Search -> Bookmarks" menu:
Cut Bookmarked Lines
Copy Bookmarked Lines
Paste to (Replace) Bookmarked Lines
Remove Bookmarked Lines
Remove Unmarked Lines

I have a dirty workaround for this. It saves some time compared to Control+C, Alt+Tab, Control+V; Control+C, Alt+Tab, Control+V; ... but it may not be worth it on big files or if the differences between the files are large. For bigger files you may prefer using some other tool.
Typically this works best when comparing groups of 'words' and does not work with content that is tabulated (like source code).
So the workaround is:
Optional (depends on the content that's being compared): Sort both files, which will make the later comparison easier. To do this: Edit => Line operations => Sort Lines Lexicographically Ascending (do it on both files)
Compare files with the plugin
Choose one file and inspect the lines you want to keep. Add one tab before each of those lines. Remember you can select several lines and press Tab to indent them all. Alternatively, you may add tabs to the lines you want to remove
Sort the file. The tabulated lines will come up first. So now you can copy-paste them (or copy-paste the untabulated ones)

Move the files to a Linux box and then execute the diff command:
$ diff file1.txt file2.txt > file_diff.txt
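Where no Linux box is at hand, a similar "differences only" listing can be sketched with Python's standard difflib module. This is an assumption-based illustration, not a replacement for diff: the name changed_lines is hypothetical, and only whole added or removed lines are reported.

```python
import difflib


def changed_lines(old_lines, new_lines):
    """Return only the lines that differ, prefixed with '- ' or '+ '."""
    return [
        entry
        for entry in difflib.ndiff(old_lines, new_lines)
        # ndiff also emits '  ' (unchanged) and '? ' (hint) entries; skip them.
        if entry.startswith("- ") or entry.startswith("+ ")
    ]


# Made-up file contents, already split into lines.
old = ["one", "two", "three"]
new = ["one", "2", "three"]

for entry in changed_lines(old, new):
    print(entry)
```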
