Bash grep ip from line

Bash grep ip from line - regex

I have a file, ip.txt, which contains the following:
ata001dcfe16f85.mm.ph.ph.cox.net (24.252.231.220)
220.231.252.24.xxx.com (24.252.231.220)
and I wrote this bash command to extract the IPs:
grep -Eo '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' ip.txt | sort -u > good.txt
I want to change the command so that it extracts ONLY the IPs between the parentheses, not every IP on the line; the current command also extracts the IP 220.231.252.24.

To get the IP within the parentheses, all you need to do is wrap the entire regex in escaped parentheses, \( and \):
grep -Eo '\((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\)'
will give output as
(24.252.231.220)
(24.252.231.220)
If you want to get rid of the parentheses in the output as well, lookarounds are useful (they require grep -P, i.e. PCRE support):
grep -oP '(?<=\()(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?=\))'
would produce output as
24.252.231.220
24.252.231.220
A more compact version would be
grep -oP '(?<=\()(25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})){3}(?=\))'
Here
[0-9]{1,2} matches one or two digits, the same as [0-9][0-9]? (note that [0-9]{2}? would require exactly two digits, because the trailing ? only makes the quantifier lazy)
(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})){3} matches a . followed by an octet, three times
Duplicate lines can be removed by piping to sort -u (uniq on its own only collapses adjacent duplicates):
grep -oP '(?<=\()(25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})){3}(?=\))' input | sort -u
giving the output as
24.252.231.220
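
If your grep has no -P, a rough sketch of the same idea is to keep the parentheses in the match and strip them afterwards with tr. The sample variable below is only a stand-in for ip.txt, and the octet helper variable is introduced here for readability; neither comes from the answer above.

```shell
# Stand-in for the contents of ip.txt from the question.
sample='ata001dcfe16f85.mm.ph.ph.cox.net (24.252.231.220)
220.231.252.24.xxx.com (24.252.231.220)'

# One octet (0-255); hypothetical helper variable to keep the regex short.
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'

# Match "(a.b.c.d)" including the parentheses, then delete the parentheses
# and collapse duplicates.
ips=$(printf '%s\n' "$sample" |
  grep -Eo "\\(${octet}(\\.${octet}){3}\\)" |
  tr -d '()' |
  sort -u)

printf '%s\n' "$ips"
```

Because the parentheses are part of the match, the leading 220.231.252.24 on the second line is never picked up.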

You can try awk
awk -F"[()]" '{print $(NF-1)}' file
24.252.231.220
24.252.231.220

Related

Get line by index from grep output

I'm trying to get the machine's IPv4 address.
I've got this working, but is there a way to specify that I want index 2 from the returned data?
ip a | grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}"
Output
127.0.0.1
192.168.13.131
192.168.13.255

With awk, try the following, written along the lines of your attempt.
ip a |
awk '
match($0,/([0-9]{1,3}\.){3}[0-9]{1,3}/) && ++count==2{
print substr($0,RSTART,RLENGTH)
}'
Explanation: the same command with a comment after each step.
ip a |
##Running ip a command and sending its output to awk as input
awk '
##Starting awk program from here.
match($0,/([0-9]{1,3}\.){3}[0-9]{1,3}/) && ++count==2{
##Using match function to match IP address and checking if its 2nd time coming.
print substr($0,RSTART,RLENGTH)
##Printing matching sub string here.
}'

One option is to pipe it to sed and print the second line
ip a | grep -Eo "([0-9]{1,3}\.){3}[0-9]{1,3}" | sed -n 2p
Another option could be using a combination of head and tail, showing the first n items with head and then take the last item from that result.
ip a | grep -Eo "([0-9]{1,3}\.){3}[0-9]{1,3}" | head -n2 | tail -n1
If your grep supports -P (PCRE), you could also use a longer pattern that repeats the IP match n times with a quantifier, for example {2} to repeat it twice.
Then use \K to discard everything matched so far and output only the IP of interest.
ip a | grep -Poz "(?s)\A(?:.*?(?:[0-9]{1,3}\.){3}[0-9]{1,3}){2}.*?\K(?:[0-9]{1,3}\.){3}[0-9]{1,3}"
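The {2} after the first non-capturing group is the quantifier: it consumes two addresses before \K, so the address printed is the one that follows them.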
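
As a minimal sketch, the grep and sed -n combination above can be wrapped in a small function that takes the position as an argument. The names fake_ip_a and nth_ip are made up for this example, and fake_ip_a only imitates the shape of real ip a output so that the snippet runs anywhere.

```shell
# Hypothetical stand-in for `ip a`; the real command prints more detail.
fake_ip_a() {
  printf '%s\n' \
    '1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536' \
    '    inet 127.0.0.1/8 scope host lo' \
    '2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500' \
    '    inet 192.168.13.131/24 brd 192.168.13.255 scope global eth0'
}

# nth_ip N: print the N-th (1-based) dotted quad found on standard input.
nth_ip() {
  grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | sed -n "${1}p"
}

second=$(fake_ip_a | nth_ip 2)
printf '%s\n' "$second"
```

With the real command this would be used as ip a | nth_ip 2. Note that grep -o counts every match, so two addresses on the same line count as two entries.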

Parsing log file

I am trying to parse a text like this from a log file:
[2016-01-29 11:31:33,809: WARNING/Worker-1283] 1030140:::DEAL_OF_DAY:::29:::1:::11
[2016-01-29 11:31:34,103: WARNING/Worker-1197] 1025311:::DEAL_OF_DAY:::29:::1:::11
[2016-01-29 11:31:34,291: WARNING/Worker-1197] 1025158:::DEAL_OF_DAY:::29:::1:::11
I want to extract these numbers 1030140, 1025311, 1025158 and so on.
I have tried the following
cat deals29.txt | egrep -o '[0-9]+'
But this gives other digits as well
I tried
cat deals29.txt | egrep -o ' [0-9]+:::'
but now the output includes the colons as well, and the command-line version of grep has no way to print only a capture group.
Any suggestions? grep solution would be preferred but I can go with sed/awk as well if grep cannot do the job.

Using grep -oP and match reset \K:
grep -oP '^\[.*?\] \K\d+' file.log
1030140
1025311
1025158
If your grep doesn't support -P (PCRE) then use awk:
awk -F '\\] |:::' '{print $2}' file.log
1030140
1025311
1025158
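
Since grep cannot print just a capture group, another option is sed, which can. The following is a sketch under the assumption that every log entry sits on its own line and starts with a bracketed timestamp; the log variable simply holds the three sample lines from the question.

```shell
# The three sample log lines from the question, one entry per line.
log='[2016-01-29 11:31:33,809: WARNING/Worker-1283] 1030140:::DEAL_OF_DAY:::29:::1:::11
[2016-01-29 11:31:34,103: WARNING/Worker-1197] 1025311:::DEAL_OF_DAY:::29:::1:::11
[2016-01-29 11:31:34,291: WARNING/Worker-1197] 1025158:::DEAL_OF_DAY:::29:::1:::11'

# Skip the leading "[...] " prefix, capture the digits before ":::",
# and print only lines where the substitution succeeded.
ids=$(printf '%s\n' "$log" |
  sed -n 's/^\[[^]]*] \([0-9][0-9]*\):::.*/\1/p')

printf '%s\n' "$ids"
```

The -n flag together with the p suffix means lines that do not match the expected layout are silently dropped rather than printed unchanged.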

You can practice regular expressions at https://regex101.com/
I came up with
] [0-9]*
after which you have to delete the first 2 characters of each match.

You could use a solution like:
(\d{3,})::
# looks for at least 3 digits (or more) followed by two colons
# puts the matched numbers in group 1

Sed : print all lines after match

I got my search result using sed:
zcat file* | sed -e 's/.*text=\(.*\)status=[^/]*/\1/' | cut -f 1 - | grep "pattern"
But it only shows the part that I cut. How can I print all lines after a match ?
I'm using zcat so I cannot use awk.
Thanks.
Edited :
This is my log file :
[01/09/2015 00:00:47] INFO=54646486432154646 from=steve idfrom=55516654455457 to=jone idto=5552045646464 guid=100021623456461451463 num=6 text=hi my number is 0 811 22 1/12 status=new survstatus=new
My aim is to find all users that spam my site with their telephone numbers (using grep "pattern") then print all the lines to get all the information about each spam. The problem is there may be matches in INFO or id, so I use sed to get the text first.

Printing all lines after a match in sed:
$ sed -ne '/pattern/,$ p'
# alternatively, if you don't want to print the match:
$ sed -e '1,/pattern/ d'
Filtering lines when pattern matches between "text=" and "status=" can be done with a simple grep, no need for sed and cut:
$ grep 'text=.*pattern.* status='
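
To see the difference between the two sed forms, here is a small sketch run against made-up input (the five sample lines are purely illustrative):

```shell
# Illustrative input; "pattern here" is the only matching line.
sample='alpha
beta
pattern here
gamma
delta'

# Print from the first matching line through the end of input.
from_match=$(printf '%s\n' "$sample" | sed -n '/pattern/,$p')

# Delete everything up to and including the first matching line.
after_match=$(printf '%s\n' "$sample" | sed '1,/pattern/d')

printf 'including the match:\n%s\n' "$from_match"
printf 'excluding the match:\n%s\n' "$after_match"
```

One caveat: in the second form the end of the range is only looked for starting at line 2, so if the very first line matches, sed keeps deleting until the next match.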

You can use awk
awk '/pattern/,EOF'
n.b. don't be fooled: EOF is just an uninitialized variable, and by default 0 (false). So that condition cannot be satisfied until the end of file.
Perhaps this could be combined with all the previous answers using awk as well.
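
A common alternative to the range form, shown here only as a sketch, is to set a flag once the pattern has been seen and print whenever the flag is set. The variable name found and the sample input are arbitrary choices for this example.

```shell
# Illustrative input with a single matching line.
sample='first line
second line
the pattern is here
fourth line'

# Once a line matches, `found` stays 1, and a bare true condition
# makes awk print the current line.
tail_part=$(printf '%s\n' "$sample" | awk '/pattern/ { found = 1 } found')

printf '%s\n' "$tail_part"
```

This avoids relying on an uninitialized variable as the range end, which some readers find easier to follow.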

Maybe this is what you actually want? Find lines matching "pattern" and extract the field after text= up through just before status=?
zcat file* | sed -e '/pattern/s/.*text=\(.*\)status=[^/]*/\1/'
You are not revealing what pattern actually is -- if it's a variable, you cannot use single quotes around it.
Notice that \(.*\)status=[^/]* would match up through survstatus=new in your example. That is probably not what you want? There doesn't seem to be a status= followed by a slash anywhere -- you really should explain in more detail what you are actually trying to accomplish.
Your question title says "all line after a match" so perhaps you want everything after text=? Then that's simply
sed 's/.*text=//'
i.e. replace up through text= with nothing, and keep the rest. (I trust you can figure out how to change the surrounding script into zcat file* | sed '/pattern/s/.*text=//' ... oops, maybe my trust failed.)

The seldom used branch command will do this for you. Until you match, use n for next then branch to beginning. After match, use n to skip the matching line, then a loop copying the remaining lines.
cat file | sed -n -e ':start; /pattern/b match;n; b start; :match n; :copy; p; n ; b copy'

Your current pipeline is
zcat file* | sed -e 's/.*text=\(.*\)status=[^/]*/\1/' | cut -f 1 - | grep "pattern"
Instead, replace the last two segments (cut and grep) with a single awk:
zcat file* | sed -e 's/.*text=\(.*\)status=[^/]*/\1/' | awk '$1 ~ "pattern" {print $0}'

Regex to match an IP address within a colon and a slash with grep

The lines in the file I want to search look like this:
log:192.1.1.128/50098
log:192.1.1.11/22
...
Now I tried the following RegEx but none of them worked:
grep -oE "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b" file
grep -oE "\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b"
grep -oE "\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"

You can do this without regex using awk (on this simple example):
awk -F":|/" '{print $2}' file
192.1.1.128
192.1.1.11
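To check that the IP consists of four dot-separated parts (note the comparison ==, since a single = would assign 4 to n and always be true):
awk -F":|/" '{n=split($2,a,".");if (n==4) print $2}' file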
192.1.1.128
192.1.1.11
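
If a stricter check is wanted, the split result can also be used to verify each part. This is a sketch rather than a full validator: the sample lines, including the two deliberately invalid ones, are invented for the example, and the array name octet is arbitrary.

```shell
# Two lines from the question plus two invented invalid entries.
sample='log:192.1.1.128/50098
log:192.1.1.11/22
log:not.an.ip/80
log:300.1.1.1/443'

# Keep field 2 only if it has four parts, each all digits and at most 255.
valid=$(printf '%s\n' "$sample" | awk -F'[:/]' '
{
  n = split($2, octet, ".")
  ok = (n == 4)
  for (i = 1; ok && i <= n; i++) {
    if (octet[i] !~ /^[0-9]+$/ || octet[i] > 255) {
      ok = 0
    }
  }
  if (ok) {
    print $2
  }
}')

printf '%s\n' "$valid"
```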

You could use grep also.
$ grep -oP '.*?:\K[^/]*(?=/)' file
192.1.1.128
192.1.1.11
Grep's extended regexp parameter -E won't support \d, you need to use [0-9] instead of \d.
$ grep -oE "\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\b" file
192.1.1.128
192.1.1.11

What's the right regex to exclude a large number?

I need to grep through a server's access log for lines that do not contain the number 12345 but do contain the word omgspecialword.
What's the regex that will allow me to grep for these lines?

You don't need a single regex if the number and the word are fixed; just use | to pipe the results through two separate greps:
cat file_name | grep -v 12345 | grep omgspecialword
Explanation:
cat file_name | - cat prints the content of file_name and pipes it into the next segment
grep -v 12345 | excludes the lines that contain matching pattern 12345, then pipes the result into the next segment
grep omgspecialword filters the lines that contain matching pattern omgspecialword. Since it's not piped into anything else here, this is printed to stdout.
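
One thing to watch for: grep -v 12345 also drops lines where 12345 is only part of a longer number. The sketch below compares the plain filter with -w (whole-word matching, available in GNU and BSD grep); the log lines are invented for illustration, and whether the stricter behaviour is wanted depends on your data.

```shell
# Invented access-log lines; the last one contains 123456, not 12345.
log='GET /a omgspecialword id=12345
GET /b omgspecialword id=777
GET /c plainword id=777
GET /d omgspecialword id=123456'

# Substring exclusion: drops any line containing the digits 12345.
loose=$(printf '%s\n' "$log" | grep -v 12345 | grep omgspecialword)

# Whole-word exclusion: drops only lines where 12345 stands alone.
strict=$(printf '%s\n' "$log" | grep -vw 12345 | grep omgspecialword)

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```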

grep 'omgspecialword' your_file|grep -v 12345
or
awk '$0!~/12345/ && /omgspecialword/' your_file
or
perl -lne 'if(/omgspecialword/ && !(/12345/)){print}' your_file
