awk regex doesn't work when matching IP addresses

I want to extract the IP addresses from a file.
Each line of the file looks like this:
T 218.241.107.98 167.232.255.245 7 2719 1378473670 N 0 0 0 G 0 I 218.241.107.97,0.146,1 218.241.98.45,0.239,1 192.168.1.253,0.182,1 159.226.253.77,0.210,1 159.226.253.54,0.676,1 159.226.254.254,39.287,1 203.192.137.173,39.335,1 203.192.134.69,50.128,1 61.14.157.141,42.917,1 202.147.61.193,188.165,1 38.104.84.41,201.100,1 154.54.30.193,194.939,1 154.54.41.221,194.915,1 154.54.5.65,237.396,1 154.54.2.81,251.547,1 154.54.24.153,260.946,1 154.54.26.126,256.046,1 154.54.10.14,245.145,1 193.251.240.113,241.663,1 q q q 57.69.31.22,283.784,1;57.69.31.22,284.763,1
But my awk script doesn't work:
#!/usr/bin/awk -f
BEGIN {
FS = "[, \t;]"
}
{
for(i = 4; i <= NF; i++)
{
if ($1 == "#")
continue
if ($i ~ /(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}/)
printf $i"\t"
if (i == NF)
printf "\n"
}
}
Can anyone figure out what's wrong?
Any help would be really appreciated. Thanks in advance.
PS: there is no output except a newline character.

Try this awk command (the loop runs to i<=NF so that the last field is checked too):
awk -F"[, \t;]+" '!/^#/ {for (i=1;i<=NF;i++) if ($i ~ /(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}/) printf "%s\t",$i;print ""}' file
218.241.107.98 167.232.255.245 218.241.107.97 218.241.98.45 192.168.1.253 159.226.253.77 159.226.253.54 159.226.254.254 203.192.137.173 203.192.134.69 61.14.157.141 202.147.61.193 38.104.84.41 154.54.30.193 154.54.41.221 154.54.5.65 154.54.2.81 154.54.24.153 154.54.26.126 154.54.10.14 193.251.240.113 57.69.31.22 57.69.31.22
The !/^#/ pattern makes it print only lines that do not start with #.
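For comparison, the same extraction can be sketched in Python. This is only an illustration and not part of the original answer: the helper name `extract_ips` is made up here, and the octet pattern is copied from the awk regex but applied with `fullmatch`, so each token has to be a complete IPv4 address.

```python
import re

# Same octet alternatives as the awk regex; fullmatch() anchors the pattern.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
IPV4 = re.compile(OCTET + r"(?:\." + OCTET + r"){3}")


def extract_ips(line):
    """Return the IPv4 addresses found in one line, or [] for a '#' comment line."""
    if line.startswith("#"):
        return []
    # Split on the same separators as the awk field separator [, \t;]+
    tokens = re.split(r"[, \t;]+", line.strip())
    return [tok for tok in tokens if IPV4.fullmatch(tok)]


if __name__ == "__main__":
    sample = "T 218.241.107.98 167.232.255.245 7 2719 N 218.241.107.97,0.146,1 q 57.69.31.22,283.784,1"
    print("\t".join(extract_ips(sample)))
```

Anchoring the match is a deliberate difference from the awk version, whose unanchored regex would also accept a token that merely contains an address.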

Related

How to use sed to extract numbers from a comma separated string?

I managed to extract the following response and comma-separate it. It's a comma-separated string, and I'm only interested in the comma-separated values of the ACCOUNT_IDs. How do you pattern-match this using sed?
Input: ACCOUNT_ID,711111111119,ENVIRONMENT,dev,ACCOUNT_ID,111111111115,dev
Expected Output: 711111111119, 111111111115
My $input variable stores the input
I tried the command below, but it joins all the numbers together, and I would like to preserve the comma ',':
echo $input | sed -e "s/[^0-9]//g"
I think you're better served with awk:
awk -v FS=, '{for(i=1;i<=NF;i++)if($i~/[0-9]/){printf "%s%s",sep,$i;sep=","}}'
If you really want sed, you can go for
sed -e "s/[^0-9]/,/g" -e "s/,,*/,/g" -e "s/^,\|,$//g"
$ awk '
BEGIN {
FS = OFS = ","
}
{
c = 0
for (i = 1; i <= NF; i++) {
if ($i == "ACCOUNT_ID") {
printf "%s%s", (c++ ? OFS : ""), $(i + 1)
}
}
print ""
}' file
711111111119,111111111115
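As an illustration in Python (not taken from the answers above), the "take the value that follows each ACCOUNT_ID key" logic might look like the sketch below. The helper name `account_ids` and its parameters are hypothetical.

```python
def account_ids(record, key="ACCOUNT_ID", sep=","):
    """Return the values that directly follow each `key` field, joined by `sep`."""
    fields = record.strip().split(sep)
    # Pair every field with its successor and keep the successor whenever
    # the field itself is the key.
    values = [nxt for cur, nxt in zip(fields, fields[1:]) if cur == key]
    return sep.join(values)


if __name__ == "__main__":
    line = "ACCOUNT_ID,711111111119,ENVIRONMENT,dev,ACCOUNT_ID,111111111115,dev"
    print(account_ids(line))
```

Keying on the field name rather than on "anything numeric" avoids picking up other numeric fields that might appear in the record.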

(bash) check if IP in subnet range file

I have a list of subnet ranges in a file:
2.32.0.0-2.47.255.255-255.240.0.0
2.112.0.0-2.119.255.255-255.248.0.0
2.156.0.0-2.159.255.255-255.252.0.0
2.192.0.0-2.199.255.255-255.248.0.0
...
(The file format is: {startip}-{endip}-{netmask})
I need to check whether an IP address is included in one of the subnets in the file.
You may use awk for that:
echo '127.0.0.0-127.255.255.255-255.0.0.0' | awk -F- '
BEGIN { ip[1] = 127; ip[2] = 0; ip[3] = 0; ip[4] = 1; }
{ split($1, startIp, "."); split($2, endIp, ".");
for(i = 1; i <= 4; i++) {
if(ip[i] < int(startIp[i]) || ip[i] > int(endIp[i]))
break;
}
if(i == 5)
print "matching line: ", $0; }'
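The IP address to search for is set up as an array in the BEGIN clause. The for loop then compares each octet with the corresponding octets of startIp and endIp, and if every octet lies between them, "matching line" is printed. Note that this octet-by-octet test only equals a true range check when each range is a whole subnet block, as the ranges in the sample file appear to be.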
Some Python 3 gibberish relying on the ipaddress module from 3.3 (a backport is available for 2.6/2.7):
python3 -c 'from ipaddress import ip_address as IP; list(
map(print, ((startip, endip) for startip, endip, _ in
(ip.split("-") for ip in open("tmp/iplist.txt"))
if IP(startip) <= IP("127.0.0.1") <= IP(endip))))'
This is actually a one-liner version of the following script:
import sys
from ipaddress import ip_address as IP
ip = IP(sys.argv[1])
with open(sys.argv[2]) as f:
    for line in f:
        # Each line is {startip}-{endip}-{netmask}; the netmask is not needed here.
        startIp, endIp, _ = line.split('-')
        # Inclusive comparison, so the first and last address of a range match too.
        if IP(startIp) <= ip <= IP(endIp):
            print(line, end='')
It can be used like this:
$ python3 ipcheck.py 127.0.0.1 iplist.txt
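Since each line also carries the netmask, another option is to let `ipaddress` build the network from the start address and the mask, then test membership directly. This is a minimal sketch under two assumptions: every line has the `{startip}-{endip}-{netmask}` shape shown in the question, and the start address is the network address. The name `matching_ranges` is chosen for this example only.

```python
from ipaddress import ip_address, ip_network


def matching_ranges(ip, lines):
    """Return the range lines whose start address and netmask contain `ip`."""
    addr = ip_address(ip)
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        start, _end, mask = line.split("-")
        # ip_network accepts "address/netmask"; it raises ValueError if the
        # start address has host bits set for that mask.
        if addr in ip_network(f"{start}/{mask}"):
            hits.append(line)
    return hits


if __name__ == "__main__":
    ranges = [
        "2.32.0.0-2.47.255.255-255.240.0.0",
        "2.156.0.0-2.159.255.255-255.252.0.0",
    ]
    print(matching_ranges("2.158.1.2", ranges))
```

The end address is ignored here because the network already implies it; if the file could contain ranges that are not whole subnets, the start/end comparison from the script above would be the safer choice.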
Try this:
BEGIN {
FS="."
ex = "false"
split(address, ip, ".")
}
{
split($0, range, "[-.]")
for (i=1; i<5; i++) {
if (ip[i] < range[i] || ip[i] > range[i+4])
break;
else if ((ip[i] > range[i] && ip[i] < range[i+4]) || i == 4)
ex = "true"
}
}
END {
print ex
}
Invoke this awk script (checkIP.awk) like this:
$ awk -v address="2.156.0.5" -f checkIP.awk /path/to/ip/ranges/file
true
$ awk -v address="0.0.0.0" -f checkIP.awk /path/to/ip/ranges/file
false
You can use this awk script:
awk -F- -v arg='2.158.1.2' 'function ipval(addr,    arr, n, i, s) {
   # Treat the four octets as base-256 digits, so each address maps to a unique number
   n = split(addr, arr, ".");
   s = 0;
   for (i=1; i<=n; i++)
      s += arr[i] * 256^(4-i);
   return s
}
ipval(arg) >= ipval($1) && ipval(arg) <= ipval($2)' file
2.156.0.0-2.159.255.255-255.252.0.0
ipval converts the given IP address to a numeric value so that addresses can be compared easily with arithmetic operators.
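To make the numeric-value idea concrete, here is a small Python sketch. It weights the octets by powers of 256 (each octet ranges over 0-255, so base 256 gives every address a distinct integer). The names `ipval` and `in_range` mirror the awk function but are otherwise just illustrative.

```python
def ipval(ip):
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    value = 0
    for octet in ip.split("."):
        # Shift the accumulated value one octet to the left, then add the next octet.
        value = value * 256 + int(octet)
    return value


def in_range(ip, start, end):
    """True if start <= ip <= end when the addresses are compared numerically."""
    return ipval(start) <= ipval(ip) <= ipval(end)


if __name__ == "__main__":
    print(ipval("127.0.0.1"))
    print(in_range("2.158.1.2", "2.156.0.0", "2.159.255.255"))
```

A smaller base would not work for this purpose, because two different addresses could then map to the same number.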
Check whether the IP is valid and whether it's on a local connection:
if ! ip route get ${ip} | grep -v via > /dev/null;then echo "bad IP";fi
I find this variant of the above script more useful:
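function ipval()
{
    # Sets the global RES to the numeric value of the dotted-quad IP in $1.
    # (A shell "return" only carries an exit status of 0-255, so it cannot hold the result.)
    RES=$(awk -v arg="$1" 'BEGIN {
        n = split(arg, arr, ".");
        s = 0;
        for (i=1; i<=n; i++) { s += arr[i] * 256^(4-i); }
        print s
    }')
}
read -p "IP:" IP
ipval "$IP"
echo "RES=$RES"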

using Regex in AWK seems to not find pattern

Hi, I am trying to match the following string, to no avail:
echo '[xxAA][xxBxx][C]' | awk -F '/\[.*\]/' '{ for (i = 1; i <= NF; i++) printf "-->%s<--\n", $i }'
I basically want each field to be the contents of one pair of brackets, such that
field 1 = xxAA
field 2 = xxBxx
field 3 = C
but I keep getting the following result:
-->[xxAA][xxBxx][C]<--
Any pointers on where I am going wrong?
You can use a regex as the field separator. We put [ and ] each in a bracket expression so that they are treated as literals, and join the two with |, which is a logical OR. Because every bracket is now a separator, the text we want lands in the even-numbered fields, so we iterate over those only.
$ echo '[xxAA][xxBxx][C]' | awk -v FS="[]]|[[]" '{ for (i=2;i<=NF;i+=2) print $i }'
xxAA
xxBxx
C
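The regex /\[.*\]/ matches the entire input, because .* is greedy and matches the ][ inside the input as well as the letters. Note also that the value given to -F is a string, so the surrounding slashes become literal characters of the separator; since the input contains no /, the line is never split, which is the output you see.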
You could split fields on the ']' character instead, then put it back again in the output:
echo '[xxAA][xxBxx][C]' | awk -F ']' '{ for (i = 1; i <= NF; i++) if ($i != "") printf "-->%s]<--\n", $i }'
This is a job for GNU awk's FPAT variable which lets you specify the pattern of the fields rather than the pattern of the field separators:
$ echo '[xxAA][xxBxx][C]' | awk -v FPAT='[^][]+' '{ for (i = 1; i <= NF; i++) printf "-->%s<--\n", $i }'
-->xxAA<--
-->xxBxx<--
-->C<--
With other awks I'd use:
$ echo '[xxAA][xxBxx][C]' | awk -F'\\]\\[' '{ gsub(/^\[|\]$/,""); for (i = 1; i <= NF; i++) printf "-->%s<--\n", $i }'
-->xxAA<--
-->xxBxx<--
-->C<--
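The field-pattern idea behind FPAT carries over to other regex engines. As a rough Python illustration (not part of the answers above), `re.findall` with an equivalent bracket-excluding character class returns the bracket contents, while the greedy `\[.*\]` pattern swallows the whole string. The variable names are chosen for this example.

```python
import re

text = "[xxAA][xxBxx][C]"

# Describe the fields themselves: runs of characters that are neither [ nor ].
fields = re.findall(r"[^\[\]]+", text)

# The greedy pattern from the question spans from the first [ to the last ].
greedy = re.findall(r"\[.*\]", text)

if __name__ == "__main__":
    for field in fields:
        print(f"-->{field}<--")
    print(greedy)
```

Describing what a field looks like, rather than what separates fields, sidesteps the empty leading and trailing fields that separator-based splitting produces for this input.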

How to have a good output in bash when printing?

I have this command here, and I have a problem achieving a good format.
Given these lines:
DATE*2014*09*23
VAL*0001*ABC
N3*Sample
VAL*0002*XYZ
My desired output here is like this:
["ABC", "XYZ"]
I tried this code:
perl -nle 'print $& if /VAL\*[0-9]*\*\K.*/' file | awk '{ printf "\"%s\",", $0 }'
but it only produces:
"ABC","XYZ",
Another case is when there is only one value to print.
If the file looks like this:
DATE*2014*09*23
VAL*0001*ABC
N3*Sample
my desired output would be just this (without the surrounding []):
"ABC"
You can do it all with awk:
#!/usr/bin/awk -f
BEGIN {FS="*"; i=0; ORS=""}   # ORS="" so that print adds no newline of its own
$1=="VAL" {a[i++]=$3}         # collect the third field of every VAL line
END {
    if (i>1) {
        print "[\"" a[0]
        for (j = 1; j < i; j++)
            print "\", \"" a[j]
        print "\"]\n"
    }
    if (i==1)
        print "\"" a[0] "\"\n"
}
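For readers who prefer to see the formatting rule spelled out, here is a hedged Python sketch of the same logic: collect the third `*`-separated field of every VAL line, then print a bracketed list for several values or a bare quoted string for exactly one. The function name `format_vals` is an assumption made for this example.

```python
def format_vals(lines):
    """Collect the third '*'-separated field of VAL lines and format the result."""
    values = []
    for line in lines:
        parts = line.strip().split("*")
        if parts[0] == "VAL" and len(parts) >= 3:
            values.append(parts[2])
    quoted = [f'"{v}"' for v in values]
    if len(quoted) == 1:
        # A single value is returned bare, without the surrounding brackets.
        return quoted[0]
    if quoted:
        return "[" + ", ".join(quoted) + "]"
    return ""


if __name__ == "__main__":
    sample = ["DATE*2014*09*23", "VAL*0001*ABC", "N3*Sample", "VAL*0002*XYZ"]
    print(format_vals(sample))
```

Joining the quoted values with a separator, instead of printing a comma after each one, is what avoids the trailing comma from the original perl and awk pipeline.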

awk remove unwanted records and consolidate multiline fields to one line record in specific order

I have an output file that I am trying to process into a formatted csv for our audit team.
I thought I had this mastered until I stumbled across bad data within the output. As such, I want to be able to handle this using awk.
MY OUTPUT FILE EXAMPLE
Enter password ==>
o=hoster
ou=people,o=hoster
ou=components,o=hoster
ou=websphere,ou=components,o=hoster
cn=joe-bloggs,ou=appserver,ou=components,o=hoster
cn=joe
sn=bloggs
cn=S01234565
uid=bloggsj
cn=john-blain,ou=appserver,ou=components,o=hoster
cn=john
uid=blainj
sn=blain
cn=andy-peters,ou=appserver,ou=components,o=hoster
cn=andy
sn=peters
uid=petersa
cn=E09876543
THE OUTPUT I WANT AFTER PROCESSING
joe,bloggs,s01234565;uid=bloggsj,cn=joe-bloggs,ou=appserver,ou=components,o=hoster
john,blain;uid=blainj;cn=john-blain,ou=appserver,ou=components,o=hoster
andy,peters,E09876543;uid=E09876543;cn=andy-peters,ou=appserver,ou=components,o=hoster
As you can see:
we always have a cn= variable that contains o=hoster
uid can have any value
we may have multiple cn= variables without o=hoster
I have achieved the following:
cat output | awk '!/^o.*/ && !/^Enter.*/{print}' | awk '{getline a; getline b; getline c; getline d; print $0,a,b,c,d}' | awk -v srch1="cn=" -v repl1="" -v srch2="sn=" -v repl2="" '{ sub(srch1,repl1,$2); sub(srch2,repl2,$3); print $4";"$2" "$3";"$1 }'
Any pointers or guidance on doing this with awk would be greatly appreciated. Or should I give up and just use the age-old, long-winded method of a large looping script to process the file?
You may try the following awk code:
$ cat file
Enter password ==>
o=hoster
ou=people,o=hoster
ou=components,o=hoster
ou=websphere,ou=components,o=hoster
cn=joe-bloggs,ou=appserver,ou=components,o=hoster
cn=joe
sn=bloggs
cn=S01234565
uid=bloggsj
cn=john-blain,ou=appserver,ou=components,o=hoster
cn=john
uid=blainj
sn=blain
cn=andy-peters,ou=appserver,ou=components,o=hoster
cn=andy
sn=peters
uid=petersa
cn=E09876543
Awk code:
awk '
function out(){
print s,u,last
i=0; s=""
}
/^cn/,!NF{
++i
last = i == 1 ? $0 : last
s = i>1 && !/uid/ && NF ? s ? s "," $NF : $NF : s
u = /uid/ ? $0 : u
}
i && !NF{
out()
}
END{
out()
}
' FS="=" OFS=";" file
Result:
joe,bloggs,S01234565;uid=bloggsj;cn=joe-bloggs,ou=appserver,ou=components,o=hoster
john,blain;uid=blainj;cn=john-blain,ou=appserver,ou=components,o=hoster
andy,peters,E09876543;uid=petersa;cn=andy-peters,ou=appserver,ou=components,o=hoster
If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
This awk script works for your sample and produces the sample output:
BEGIN { delete cn[0]; OFS = ";" }
function print_info() {
if (length(cn)) {
names = cn[1] "," sn
for (i=2; i <= length(cn); ++i) names = names "," cn[i]
print names, uid, dn
delete cn
}
}
/^cn=/ {
if ($0 ~ /o=hoster/) dn = $0
else {
cn[length(cn)+1] = substr($0, index($0, "=") + 1)
uid = $0; sub("cn", "uid", uid)
}
}
/^sn=/ { sn = substr($0, index($0, "=") + 1) }
/^uid=/ { uid = $0 }
/^$/ { print_info() }
END { print_info() }
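This should help you get started. Setting RS to the empty string puts awk in paragraph mode, so it assumes the records are separated by blank lines.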
awk '$1 ~ /^cn/ {
for (i = 2; i <= NF; i++) {
if ($i ~ /^uid/) {
u = $i
continue
}
sub(/^[^=]*=/, x, $i)
r = length(r) ? r OFS $i : $i
}
print r, u, $1
r = u = x
}' OFS=, RS= infile
I assume that there is an error in your sample output: in the third record the uid should be petersa and not E09876543.
You might want to look at some of the "already been there and done that" solutions to accomplish the task.
Apache Directory Studio for example, will do the LDAP query and save the file in CSV or XLS format.
-jim
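If a scripted approach is still wanted, the consolidation can also be sketched in Python. This is an illustration under stated assumptions rather than a drop-in replacement: it treats any line that starts with `cn=` and contains `o=hoster` as the start of a record, instead of relying on blank lines, and the function name `consolidate` and its `marker` parameter are made up for this example.

```python
def consolidate(lines, marker="o=hoster"):
    """Return one 'names;uid;dn' string per directory entry."""
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("cn=") and marker in line:
            # A DN line (cn=...,o=hoster) opens a new record.
            current = {"dn": line, "names": [], "uid": ""}
            records.append(current)
        elif current is None:
            continue  # preamble before the first DN line
        elif line.startswith("uid="):
            current["uid"] = line
        elif line.startswith(("cn=", "sn=")):
            # Keep cn/sn values in the order they appear, without the key.
            current["names"].append(line.split("=", 1)[1])
    return [";".join([",".join(r["names"]), r["uid"], r["dn"]]) for r in records]


if __name__ == "__main__":
    sample = [
        "Enter password ==>",
        "o=hoster",
        "cn=joe-bloggs,ou=appserver,ou=components,o=hoster",
        "cn=joe",
        "sn=bloggs",
        "cn=S01234565",
        "uid=bloggsj",
    ]
    print("\n".join(consolidate(sample)))
```

Because attributes are stored by key as they arrive, the order of `sn=` and `uid=` lines within an entry does not matter, which was the "bad data" problem with the fixed `getline` sequence.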