grep multiple regexes from a line - regex

I want to log incoming syslog from my router to a file. I receive the syslog with
nc -l -u -p 514 > syslog.log
The incoming lines are made of several fields that are separated by whitespace.
Here are two complete sample lines from syslog:
<4>Nov 29 16:15:29 kernel: [ 3571.330000] DROP IN=vlan2 OUT= MAC=00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 SRC=1.235.114.117 DST=1.52.79.209 LEN=337 TOS=0x00 PREC=0x00 TTL=115 ID=30831 PROTO=UDP SPT=161 DPT=220 LEN=317
<4>Nov 29 16:15:30 kernel: [ 3572.200000] DROP IN=vlan2 OUT= MAC=00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 SRC=7.27.203.227 DST=122.2.79.209 LEN=64 TOS=0x00 PREC=0x00 TTL=52 ID=44018 DF PROTO=TCP SPT=5108 DPT=220 SEQ=3468909622 ACK=0 WIND
I want only the Time, SRC, PROTO, SPT and DPT fields in my logfile, so I thought I could use something like this as a test, for SRC and DST only:
nc -l -u -p 514 | egrep -o 'SRC=[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}|DST=[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' > syslog.log
Unfortunately this prints every field on a new line, like this:
SRC=1.235.114.117
DST=1.52.79.209
SRC=7.27.203.227
DST=122.2.79.209
I want the output to look similar to this, corresponding to the first sample line:
Time,SRC,PROTO,SPT,DPT
Nov 29 16:15:29,1.235.114.117,UDP,161,220
There is another problem: the fields are not always the same. For example, the second sample line contains a "DF" flag that the first does not. So counting fields with awk separators does not work, since they are not consistent.
Does anyone have an idea how I can do this?
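(One way to sidestep the inconsistent field count is to scan every field for KEY=value pairs instead of counting positions; a minimal GNU awk sketch, assuming the sample format above:)

nc -l -u -p 514 | awk '{
    sub(/^<[0-9]+>/, "", $1)        # strip the <4> priority tag
    time = $1 " " $2 " " $3         # e.g. "Nov 29 16:15:29"
    delete kv                       # gawk: clear the array for each line
    for (i = 4; i <= NF; i++)       # collect every KEY=value field
        if (split($i, a, "=") == 2)
            kv[a[1]] = a[2]
    print time "," kv["SRC"] "," kv["PROTO"] "," kv["SPT"] "," kv["DPT"]
}' > syslog.log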

As @twalberg suggested, I now use syslog-ng.
This is my syslog-ng.conf:
@version: 3.7
@include "scl.conf"
options {
threaded(yes);
chain_hostnames(no);
stats_freq(43200);
mark_freq(3600);
};
source s_udp { udp(port(514)); };
parser p_kv { kv-parser(prefix(".kv.")); };
destination d_router_file { file("/var/log/firewall_drops.csv" template("${DATE},${.kv.SRC},${.kv.PROTO},${.kv.SPT},${.kv.DPT}\n")); };
log { source(s_udp); parser(p_kv); destination(d_router_file); };
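With the sample lines above, this template should produce CSV records along these lines (assuming syslog-ng's default ${DATE} rendering):

Nov 29 16:15:29,1.235.114.117,UDP,161,220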


AWK catching a regular expression

I have been using this little script for months now with success. Today I realized there is one output it can't seem to catch; the screen comes up blank with a new prompt:
[user@computer ~]$ myscan ipsFile 23
[user@computer ~]$
Here is the code:
#!/bin/bash
sudo nmap -v -Pn -p "T:$2" --reason -iL "$1" | awk '{
    if (/syn-ack/) {
        print "Yes"
        c++
    }
    else if (/no-response|reset|host-unreach/) {
        print "No"
        c++
    }
}
END { print c }'
If I run nmap against one of the IPs directly, then it returns:
Starting Nmap 5.51 ( http://nmap.org ) at 2017-09-26 11:44 CDT
Initiating Parallel DNS resolution of 1 host. at 11:44
Completed Parallel DNS resolution of 1 host. at 11:44, 0.00s elapsed
Initiating Connect Scan at 11:44
Scanning 1.1.1.1 [1 port]
Completed Connect Scan at 11:44, 0.20s elapsed (1 total ports)
Nmap scan report for 1.1.1.1
Host is up, received user-set (0.20s latency).
PORT STATE SERVICE REASON
23/tcp filtered telnet host-unreach
Read data files from: /usr/share/nmap
Nmap done: 1 IP address (1 host up) scanned in 0.26 seconds
How can I catch the 'host-unreach' portion?
Let's try and debug this. Execute this:
nmap -v -Pn -p T:23 --reason -iL ipsFile | awk '{print $0}/syn-ack/{print "Yes";c++}/no-response|reset|host-unreach/{print "No";c++}END {print c}' > out.txt
The only difference here is that the awk script also prints $0 (i.e. the full output of your nmap call) to the file out.txt. Grep that file for your unreach value.
I tried this myself and found that instead of host-unreach I got net-unreach. It might be the same thing in your case.
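If so, widening the alternation covers both cases; a hedged tweak of the original script (same assumed flags as above):

#!/bin/bash
sudo nmap -v -Pn -p "T:$2" --reason -iL "$1" | awk '
/syn-ack/                                    { print "Yes"; c++ }
/no-response|reset|host-unreach|net-unreach/ { print "No";  c++ }
END                                          { print c }'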
Have you tried redirecting stderr to stdout, like this?
#!/bin/bash
sudo nmap -v -Pn -p "T:$2" --reason -iL "$1" 2>&1 | awk '{
    if (/syn-ack/) {
        print "Yes"
        c++
    }
    else if (/no-response|reset|host-unreach/) {
        print "No"
        c++
    }
}
END { print c }'

fabric: why can't I get local("history") to print out anything?

Here's my fabfile:
from fabric.api import local, task

@task
def tracking(suffix=""):
    buffer_ = "*" * 40
    print(buffer_)
    local("whoami")
    print(buffer_)
    local("env | grep dn")
    # this one comes out empty...
    print(buffer_)
    out = local("history")
    print(buffer_)
Everything prints out as expected, except for the history:
****************************************
[localhost] local: whoami
jluc
****************************************
[localhost] local: env | grep dn
dn_cb=/Users/jluc/.berkshelf/cookbooks
dn_cc=/Users/jluc/kds2/chef/chef-repo/cookbooks
dn_khtmldump=/Users/jluc/kds2/out/tests/dump2static
dn_cv=/Users/jluc/kds2/chef/vagrant/ubuntu2
****************************************
[localhost] local: history
****************************************
But there's nothing wrong with history on the command line...
history | tail -5
613 history
614 fab -f fabfile2.py tracking
615 history | tail -5
616 cls
617 history | tail -5
What gives? Adding shell="/bin/bash" didn't help either.
This is on macOS Sierra.
According to the docs:
local is not currently capable of simultaneously printing and capturing output, as run/sudo do. The capture kwarg allows you to switch between printing and capturing as necessary, and defaults to False.
I'd interpret this as meaning that if you want the history command's output, you need to capture it. Try changing your local calls to include both shell="/bin/bash" and capture=True.
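A minimal sketch of that change (hedged: whether a non-interactive shell has any history to print at all is a separate question):

from fabric.api import local, task

@task
def tracking(suffix=""):
    # capture=True makes local() return the command's output
    # instead of trying to print it
    out = local("history", capture=True, shell="/bin/bash")
    print(out)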

Grep logs between two timestamps in Shell

I am writing a script where I need to grep the logs exactly between two given timestamps. I don't want to use a regex, as it's not foolproof. Is there any other way I can achieve this?
e.g. between the time range 04:15:00 and 05:15:00
Log Format:
170.37.144.10 - - [17/Dec/2015:04:00:00 -0500] "GET /abc/def/ghi/xyz.jsp HTTP/1.1" 200 337 3440 0000FqZTmTG2yuMTJeny7hPDOvG
170.37.144.10 - - [17/Dec/2015:05:10:09 -0500] "POST /abc/def/ghi/xyz.jsp HTTP/1.1" 200 27 21124 0000FqZTmTG2yuMTJ
This might be what you want to do, using GNU awk for time functions:
$ cat tst.awk
BEGIN { FS="[][ ]+"; beg=t2s(beg); end=t2s(end) }
{ cur = t2s($4) }
(cur >= beg) && (cur <= end)
function t2s(time,   t) {
    split(time,t,/[\/:]/)
    t[2] = (match("JanFebMarAprMayJunJulAugSepOctNovDec",t[2])+2)/3
    return mktime(t[3]" "t[2]" "t[1]" "t[4]+0" "t[5]+0" "t[6]+0)
}
$ awk -v beg="17/Dec/2015:04:15" -v end="17/Dec/2015:05:15" -f tst.awk file
access_log.aging.20151217040207:170.37.144.10 - - [17/Dec/2015:05:10:09 -0500] "POST /abc/def/ghi/xyz.jsp HTTP/1.1" 200 27 21124 0000FqZTmTG2yuMTJ
but it's hard to guess without more sample input and expected output.
If you don't want to use regular expressions or patterns for matching lines, then grep alone is not enough.
Here's a Bash+date solution:
# start and stop may be parameters of your script ("$1" and "$2"),
# here they are hardcoded for convenience.
start="17/Dec/2015 04:15:00 -0500"
stop="17/Dec/2015 05:15:00 -0500"

get_tstamp() {
    # '17/Dec/2015:05:10:09 -0500' -> '17/Dec/2015 05:10:09 -0500'
    datetime="${1/:/ }"
    # '17/Dec/2015 05:10:09 -0500' -> '17 Dec 2015 05:10:09 -0500'
    datetime="${datetime//// }"
    # datetime to unix timestamp
    date -d "$datetime" '+%s'
}

start=$(get_tstamp "$start")
stop=$(get_tstamp "$stop")

while read -r line
do
    # extract '17/Dec/2015:04:00:00 -0500' from between the [brackets]
    datetime="${line#*[}"
    datetime="${datetime%%]*}"
    tstamp="$(get_tstamp "$datetime")"
    # $tstamp now contains a number like 1450347009;
    # check if it is in range $start..$stop
    [[ "$tstamp" -ge "$start" && "$tstamp" -le "$stop" ]] && echo "$line"
done
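Since the loop reads standard input, you would run the script as, e.g. (the script name filter_range.sh is a placeholder):

bash filter_range.sh < access_log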

Bash Script: sed/awk/regex to match an IP address and replace

I have a string in a bash script that contains a line of a log entry such as this:
Oct 24 12:37:45 10.224.0.2/10.224.0.2 14671: Oct 24 2012 12:37:44.583 BST: %SEC_LOGIN-4-LOGIN_FAILED: Login failed [user: root] [Source: 10.224.0.58] [localport: 22] [Reason: Login Authentication Failed] at 12:37:44 BST Wed Oct 24 2012
To clarify: the first IP listed there, "10.224.0.2", is the machine that submitted this log entry of a failed login attempt. Someone tried to log in, and failed, from the machine at the second IP address in the entry, "10.224.0.58".
I wish to replace the first occurrence of the IP address "10.224.0.2" with the host name of that machine; as you can see, it is presently "IPADDRESS/IPADDRESS", which is useless, repeating the same info twice. So here I would like to grep (or similar) out the first IP, pass it to something like the host command to get the reverse lookup, and replace it in the log output.
I would like to repeat this for the second IP, "10.224.0.58": find it and replace it with its host name as well.
It's not just those two specific IP addresses, though; it should work for any IP address. So I want to search for four groups of 1 to 3 digits, separated by full stops '.'.
Is regex the way forward here, or is that overcomplicating the issue?
Many thanks.
Replace a fixed IP address with a host name:
$ cat log | sed -r 's/10\.224\.0\.2/example.com/g'
Replace all IP addresses with a host name:
$ cat log | sed -r 's/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/example.com/g'
If you want to call an external program, it's easy to do that using Perl (just replace host with your lookup tool):
$ cat log | perl -pe 's/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/`host $1`/ge'
Hopefully this is enough to get you started.
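Note that host prints a full sentence ("... domain name pointer foo.example.com."), so in practice you would trim its output down to the bare name first. A sketch of that refinement (assumes the lookups succeed):

$ cat log | perl -pe 's{(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})}{
      my @f = split " ", `host $1`;   # last field is the PTR name
      (my $h = $f[-1]) =~ s/\.$//;    # drop the trailing dot
      $h
  }ge'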
There are various ways to find the IP addresses; here's one. Just replace "printf '<<<%s>>>' " with "host" or whatever your command name is in this GNU awk script:
$ cat tst.awk
{
    subIp = gensub(/\/.*$/,"","",$4)
    srcIp = gensub(/.*\[Source: ([^]]+)\].*/,"\\1","")
    cmd = "printf '<<<%s>>>' " subIp
    cmd | getline subName
    close(cmd)    # close so a repeated IP re-runs the command
    cmd = "printf '<<<%s>>>' " srcIp
    cmd | getline srcName
    close(cmd)
    gsub(subIp,subName)
    gsub(srcIp,srcName)
    print
}
$
$ gawk -f tst.awk file
Oct 24 12:37:45 <<<10.224.0.2>>>/<<<10.224.0.2>>> 14671: Oct 24 2012 12:37:44.583 BST: %SEC_LOGIN-4-LOGIN_FAILED: Login failed [user: root] [Source: <<<10.224.0.58>>>] [localport: 22] [Reason: Login Authentication Failed] at 12:37:44 BST Wed Oct 24 2012
I googled this one-line command together, but was unable to pass the found IP address to the ssh command:
sed -n 's/\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}/\nip&\n/gp' test | grep ip | sed 's/ip//' | sort | uniq
the "test" is the file the sed command is searching for for the pattern

Parsing CVS History Output

I just need to get a list of the most recent changes from CVS and parse them.
Example: the CVS user "lollerskates" checked in a file whose name contains spaces, but spaces are the delimiter! And then "skates" checked in a file with a space in a folder name.
% cvs history -c -a -D 2011-03-14
A 2011-03-15 00:17 +0000 jschmoe 1.1 CoolCode.java Awesome/Source/Java/src/com/widgets/foo/ambiguous/abstraction == <remote>
M 2011-03-15 00:17 +0000 sumbody 1.2 MoreCoolCode.java Awesome/Source/Java/src/com/widgets/foo/ambiguous/abstraction == <remote>
A 2011-03-15 00:17 +0000 lollerskates 1.123 This File Name Has Spaces.html Awesome/Source/Java/src/com/widgets/foo/ambiguous/abstraction == <remote>
A 2011-03-15 00:17 +0000 jschmoe 1.1 MyAwesomeProject.java Awesome/Source/Java/src/com/widgets/foo/ambiguous/abstraction == <remote>
M 2011-03-15 00:17 +0000 skates 1.5 BlahBlah.java Awesome/Source/Java/src/com/widgets/foo/content/block type/cart == <remote>
What is a reliable way to parse this?
Alternatively, is there a different CVS command with more easily parsable results?
This regex captures all of these:
\w \d{4}-\d{2}-\d{2} \d{2}:\d{2} \+\d{4} (\w+)\s+(\d+\.\d+)\s+([\w\s]+\.\w+)\s+([\w\s/]+)== \<remote\>
The user is in group #1, the revision in group #2, the filename in group #3 and the path in group #4.
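A quick way to apply it from the shell (a sketch; the printed labels are mine):

% cvs history -c -a -D 2011-03-14 | perl -ne '
      if (/^\w \d{4}-\d{2}-\d{2} \d{2}:\d{2} \+\d{4} (\w+)\s+(\d+\.\d+)\s+([\w\s]+\.\w+)\s+([\w\s\/]+)== <remote>/) {
          print "user=$1 rev=$2 file=$3 path=$4\n";
      }'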
In this particular case cut may be a better way, if the fields are fixed width...