Extract Source IP from log files - regex

I want to extract "srcip=x.x.x.x" from a log file in bash. My log file looks like this:
2019:06:23-17:50:03 myhost ulogd[5692]: id="2021" severity="info" sys="SecureNet" sub="packetfilter" name="Packet dropped (GEOIP)" action="drop" fwrule="60019" initf="eth0" srcmac="3c:1e:04:92:6f:fb" dstmac="00:50:56:97:7c:af" srcip="185.53.91.50" dstip="192.168.50.10" proto="6" length="44" tos="0x00" prec="0x00" ttl="235" srcport="54522" dstport="5038" tcpflags="SYN"
I wrote awk '{print $15}' to extract srcip, but the problem is that srcip is not at the same position in every line. How can I extract srcip=x.x.x.x regardless of its position?

With any sed in any shell on every UNIX box:
$ sed -n 's/.*\(srcip="[^"]*"\).*/\1/p' file
srcip="185.53.91.50"

The following command provides the result you expect
grep -o -P 'srcip="(\d{1,3}[.]){3}\d{1,3}"' log
The -o option prints only the matched part of each line. The -P option enables Perl-compatible regular expressions, which is what makes \d available; it is a GNU grep extension. The regex matches srcip="<ipv4>", and log is the name of the file you want to extract content from.
Here is a link to regex101 for an explanation for the regex: https://regex101.com/r/hjuZlM/2

An awk version
awk -F"srcip=" '{split($2,a," ");print FS a[1]}' file
srcip="185.53.91.50"
This splits each line on the keyword srcip=, then splits what follows on spaces and prints the keyword followed by the first piece.
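If only the address itself is needed, without the srcip= key and the quotes, the same position-independent idea can be sketched as a small shell function. This is a variant of the sed answer above with the capture group moved inside the quotes; the function name extract_srcip and the shortened sample line are made up for illustration.

```shell
#!/bin/sh
# extract_srcip: print the value of srcip="..." from each line on stdin,
# wherever the key appears. (Hypothetical helper name, not from the thread.)
extract_srcip() {
    sed -n 's/.*srcip="\([^"]*\)".*/\1/p'
}

# Demo with a shortened version of the sample log line
extract_srcip <<'EOF'
2019:06:23-17:50:03 myhost ulogd[5692]: id="2021" action="drop" srcip="185.53.91.50" dstip="192.168.50.10" proto="6"
EOF
```

For the demo line this should print 185.53.91.50. Lines without a srcip key produce no output because of -n and the p flag.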

Related

Grepping for a pattern followed by another pattern and excluding what lies in between as output

I want to do something like
egrep -o '(mon|tues)[1-3]?[0-9].*(mon|tues)[1-3]?[0-9]'
And get only the parts matched by (mon|tues)[1-3]?[0-9], dropping whatever lies between them.
With this as input
mon19hellotues20
mon19world
hellomon19
tues8worldtues22
I want
mon19tues20
tues8tues22
As output
sed is a better tool for this, since it can print only selected parts of the match:
sed -nE 's/(mon|tues)([1-3]{0,1}[0-9]).*(mon|tues)([1-3]{0,1}[0-9])/\1\2\3\4/p' file
mon19tues20
tues8tues22

How can I replace the existing MAC address with a new MAC address using sed

I have this existing pattern:
ethernet0.generatedAddress = "00:50:56:bf:71:06"
I need to replace the MAC address in the above expression with a new MAC address using a sed pattern.
Note: the MAC address that needs to be replaced is not the same every time.
I tried this sed expression, but no luck:
sed 's/ethernet0.generatedAddress.*=.*\"([[:xdigit:]]{1,2}:){5}[[:xdigit:]]{1,2}/ethernet0.generatedAddress = \"00:16:3e:5e:1d:01'
Thanks in Advance
Pattern:
([a-z0-9]{2}:[a-z0-9]{2}:[a-z0-9]{2}:[a-z0-9]{2}:[a-z0-9]{2}:[a-z0-9]{2})
Or the following one if uppercase letters are used
([a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2})
Replacement:
new_mac_address // for instance 00:f6:a0:ff:f1:06
Side note: as pointed out in the comments, escape the parentheses and curly brackets if needed, or use the -r option.
Using sed it would be something like this (just tested):
sed -r 's/(.*)([a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2})(.*)/\1NEW_MAC_ADDRESS\3/g' file.txt
Add the -i option if you want to edit the file in place.
Content of the tested file (file.txt)
something before ethernet0.generatedAddress = "00:50:56:bf:71:06" and something after
Why not use awk? It gives a simple, easy-to-understand solution.
cat file
some data
ethernet0.generatedAddress = "00:50:56:bf:71:06"
more data
mac="ab:11:23:55:11:cc"
awk -v m="$mac" -F' "|"' '/ethernet0.generatedAddress/ {$2="\""m"\""}1' file
some data
ethernet0.generatedAddress = "ab:11:23:55:11:cc"
more data
It searches for ethernet0.generatedAddress and, if found, replaces field 2 (the text between the double quotes) with the new MAC address.
If one extra space is not a problem, this is cleaner:
awk -v m="$mac" -F\" '/ethernet0.generatedAddress/ {$2=FS m FS}1' file
some data
ethernet0.generatedAddress = "ab:11:23:55:11:cc"
more data
Or this:
awk -v m="\"$mac\"" -F\" '/ethernet0.generatedAddress/ {$2=m}1' file
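For comparison, a sed version closer to the expression attempted in the question might look like the sketch below. It assumes a sed that accepts -E (GNU and BSD sed both do) and that the new address contains only hex digits and colons, so it is safe to splice into the replacement. The function name set_mac and the sample data are illustrative only.

```shell
#!/bin/sh
# set_mac NEW_MAC: rewrite the quoted MAC on ethernet0.generatedAddress lines
# read from stdin; every other line passes through unchanged.
# (Hypothetical helper name; NEW_MAC must not contain "/" or "&".)
set_mac() {
    sed -E 's/(ethernet0\.generatedAddress[[:space:]]*=[[:space:]]*")([[:xdigit:]]{2}:){5}[[:xdigit:]]{2}"/\1'"$1"'"/'
}

# Demo on the sample file contents
set_mac "00:16:3e:5e:1d:01" <<'EOF'
some data
ethernet0.generatedAddress = "00:50:56:bf:71:06"
more data
EOF
```

Anchoring the match on the key name and on [[:xdigit:]] keeps the substitution from touching other colon-separated values in the file, which the looser [a-zA-Z0-9] pattern could also hit.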

Match URL pattern within file using SED, AWK or GREP

I am trying to use grep to extract a list of urls beginning with http and ending with jpg.
grep -o 'picturesite.com/wp-content/uploads/.......' filename
The code above is how far I've gotten. I then need to pass these file names to curl
title : "Family Vacation", jpg:"http://picturesite.com/wp-content/uploads/2014/01/mypicture.jpg", owner : "PhotoTaker"
sed -nr 's/http\S*(jpg|gif|other|ext)/\
curl $CURLOPTS & >$OUT/p' <$infile | sh -n
The above command will search $infile for any string beginning with "http", followed by any number of non-whitespace characters, and ending with any of the "|"-separated file extensions in the parentheses. With -r the alternation bar must not be backslash-escaped, otherwise it matches a literal "|".
Once it has found such a string, sed substitutes it into the curl command line on the second line in place of "&". The resulting command string is then piped to sh for execution. Because sed prints the whole line, any text surrounding the URL stays in the output, so this works best when each URL sits on a line of its own.
Remember, sed is the stream editor, not just the stream searcher, so it can very capably pre-process input for other commands to make them do what you want.
Note: sh is currently passed the -n (noexec) option, which makes it read and syntax-check the commands without running them. Once you have run it a few times and are satisfied that you are doing the right thing, remove -n so that the commands are actually executed.
Note 2: If there's a chance you'll want to match more than one url per line you'll need the 'g' sed option.
You can capture url patterns by doing:
grep -o 'http[^"]*\.jpg' file
$ grep -o 'http[^"]*\.jpg' <<EOF
> title : "Family Vacation", jpg:"http://picturesite.com/wp-content/uploads/2014/01/mypicture.jpg", owner : "PhotoTaker
> EOF
http://picturesite.com/wp-content/uploads/2014/01/mypicture.jpg
curl does not read URLs from standard input by default, so a simple approach is to store the extracted URLs in a file, read that file one line at a time, and pass each line to the curl command.
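The store-then-loop approach described above can be sketched roughly as follows. The function names, the file names in the usage comment, and the choice of curl options (-f, -sS, -O) are assumptions made for illustration; grep -o is available in GNU and BSD grep but is not required by POSIX.

```shell
#!/bin/sh
# extract_jpg_urls FILE: print every http(s) URL ending in .jpg found in
# FILE, one per line. (Hypothetical helper name.)
extract_jpg_urls() {
    grep -oE 'https?://[^" ]*\.jpg' "$1"
}

# download_list FILE: run curl once for each URL listed in FILE.
# -f fails on HTTP errors, -sS is quiet but still reports errors,
# -O saves each download under its remote file name.
download_list() {
    while IFS= read -r url; do
        [ -n "$url" ] || continue   # skip blank lines
        curl -fsS -O "$url"
    done < "$1"
}

# Usage (hypothetical file names):
#   extract_jpg_urls page.txt > urls.txt
#   download_list urls.txt
```

Keeping the URL list in a file makes it easy to inspect what would be fetched before any download starts.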

How to cut/get all patterns of RE from one line

How does one get all instances, and only the instances of a regular expression contained within a single line or string?
For example, suppose the output (all one single line) from a webpage is:
<Table border=1 cellpadding=2><TR><TH><font size=2>LAN IP BLOCK</font></TH><TH><font size=2>CUST_NAME</font></TH> <TH><font size=2>ID
</TH></TR><TR><TD><font size=2>10.4.4.0 / 29</font></TD><TD><font size=2>Customer data</font></TD><TD><font size=2></font></TD></T
TD><font size=2>10.1.1.0 / 27</font></TD><TD><font size=2>Customer</font></TD><TD><font size=2></font></TD></TR></Table><p>
I'd like to get every instance of the IP CIDR data. I know I'll have to use an IP address RE (and I believe I can figure or find that out), but how do I get EACH instance and simply remove all other text? I'd like to do this on the command line with grep/sed etc., but I'm thinking I may need to use Python. I know I could use Perl, but I'd have to get that installed.
The grep options -o and -E are what you are looking for:
grep -oE "pattern1|pattern2|pattern3|pattern4|...|patternN" input_file
From man grep:
-o, --only-matching
Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.
-E, --extended-regexp
Interpret PATTERN as an extended regular expression
(-E is specified by POSIX.)
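Applied to the CIDR question, a minimal sketch could look like this. The pattern is an assumption: it accepts one to three digits per octet and an optional space on either side of the slash (as in the sample), and it does not check that octets stay within 0-255. The function name extract_cidrs is hypothetical.

```shell
#!/bin/sh
# extract_cidrs: print each IPv4 "address / prefix" occurrence found on
# stdin on its own line. Octet ranges are not validated.
extract_cidrs() {
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3} ?/ ?[0-9]{1,2}'
}

# Demo on a shortened piece of the sample HTML
extract_cidrs <<'EOF'
<TR><TD><font size=2>10.4.4.0 / 29</font></TD><TD><font size=2>Customer data</font></TD></TR><TR><TD><font size=2>10.1.1.0 / 27</font></TD></TR>
EOF
```

For the demo input this should print 10.4.4.0 / 29 and 10.1.1.0 / 27 on separate lines, with all surrounding markup dropped.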

grep - search for "<?\n" at start of a file

I have a hunch that I should probably be using ack or egrep instead, but what should I use to basically look for
<?
at the start of a file? I'm trying to find all files that contain the php short open tag since I migrated a bunch of legacy scripts to a relatively new server with the latest php 5.
I know the regex would probably be '/^<\?\n/'
I RTFM and ended up using:
grep -RlIP '^<\?\n' *
The -P option enables full Perl-compatible regexes.
If you're looking for all php short tags, use a negative lookahead
/<\?(?!php)/
will match <? but will not match <?php
[meder ~/project]$ grep -rP '<\?(?!php)' .
find . -name "*.php" | xargs grep -nHo "<?[^px]"
The x in the bracket expression excludes the XML start tag (<?xml); a second ^ inside the brackets is not needed.
If you are worried about Windows line endings, just add \r?.
grep '^<?$' filename
In case that is not displaying correctly: the pattern between the single quotes is the four characters ^ < ? $ typed with no spaces between them.
Do you mean a literal "backslash n" or do you mean a newline?
For the former:
grep '^<?\\n' [files]
For the latter:
grep '^<?$' [files]
Note that grep will search all lines, so if you want to find matches just at the beginning of the file, you'll need to either filter each file down to its first line, or ask grep to print out line numbers and then only look for line-1 matches.
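The first option, filtering each file down to its first line, might be sketched as below. The function name starts_with_short_tag is made up for illustration, and stripping carriage returns with tr is an added assumption to cope with Windows line endings.

```shell
#!/bin/sh
# starts_with_short_tag FILE...: print the names of the files whose first
# line is exactly "<?" (a trailing carriage return is tolerated).
starts_with_short_tag() {
    for f in "$@"; do
        # Only the first line is handed to grep, so later lines cannot match.
        if head -n 1 "$f" | tr -d '\r' | grep -q '^<?$'; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage: starts_with_short_tag *.php
```

In a basic regular expression the ? is an ordinary character, so '^<?$' matches a line consisting of just <? without any escaping.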