Need a regex for parsing Apache log files

I need a regex for parsing Apache log files.
For example, here is a portion of /var/log/httpd/error_log:
[Sun Sep 02 03:34:01 2012] [notice] Digest: done
[Sun Sep 02 03:34:01 2012] [notice] Apache/2.2.15 (Unix) DAV/2 mod_ssl/2.2.15 OpenSSL/1.0.0- fips SVN/1.6.11 configured -- resuming normal operations
[Sun Sep 02 03:34:01 2012] [error] avahi_entry_group_add_service_strlst("localhost") failed: Invalid host name
[Sun Sep 02 08:01:14 2012] [error] [client 216.244.73.194] File does not exist: /var/www/html/manager
[Sun Sep 02 11:04:35 2012] [error] [client 58.218.199.250] File does not exist: /var/www/html/proxy
I want a regex that treats a space as the field delimiter but keeps spaces embedded inside a field. The Apache error log format alternates between these two layouts:
[DAY MMM DD HH:MM:SS YYYY] [MSG_TYPE] DESCRIPTOR: MESSAGE
[DAY MMM DD HH:MM:SS YYYY] [MSG_TYPE] [SOURCE IP] ERROR: DETAIL
I created two regexes. The first one is:
^(\[[\w:\s]+\]) (\[[\w]+\]) (\[[\w\d.\s]+\])?([\w\s/.(")-]+[\-:]) ([\w/\s]+)$
This one is simple and just matches the contents as they are.
I want something like the following regex, which I also created:
(?<=|\s)([\w:\S]+)
This one doesn't give me the desired output because it doesn't keep embedded spaces. So I need a regex that groups each field, keeps embedded spaces, and uses a space as the delimiter. Please help me with the logic.
My code:
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

void regexparser(CharBuffer cb) {
    try {
        // Split the buffer into lines, then match each line against the log pattern.
        Pattern linePattern = Pattern.compile(".*\r?\n");
        Pattern csvpat = Pattern.compile(
            "^\\[([\\w:\\s]+)\\] \\[([\\w]+)\\] (\\[([\\w\\d.\\s]+)\\])?([\\w\\s/.(\")-]+[\\-:]) ([\\w/\\s].+)",
            Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.MULTILINE);
        Matcher lm = linePattern.matcher(cb);
        Matcher pm = null;
        while (lm.find()) {
            CharSequence cs = lm.group();
            if (pm == null)
                pm = csvpat.matcher(cs);
            else
                pm.reset(cs);
            while (pm.find()) {
                // Group 4 holds the optional "client IP" text; skip it when absent.
                if (pm.group(4) == null)
                    System.out.println(pm.group(1) + " " + pm.group(2) + " " + pm.group(5) + " " + pm.group(6));
                else
                    System.out.println(pm.group(1) + " " + pm.group(2) + " " + pm.group(4) + " " + pm.group(5) + " " + pm.group(6));
            }
        }
    } catch (Exception e) {
        // The posted snippet is cut off before this point; a generic handler is assumed.
        e.printStackTrace();
    }
}

I agree that this task is best done with an existing solution for parsing Apache logs.
However, if you want to try it yourself for training purposes, you could start with this. Instead of parsing everything with one huge regex, I do it in small steps that are much more readable:
Code
#!/usr/bin/env perl
use strict;
use warnings;
use DateTime::Format::Strptime;
use feature 'say';

# iterate log lines
while (defined(my $line = <DATA>)) {
    chomp $line;

    # prepare
    my %data;
    my $strp = DateTime::Format::Strptime->new(
        pattern => '%a %b %d %H:%M:%S %Y',
    );

    # consume date/time
    next unless $line =~ s/^\[(\w+ \w+ \d+ \d\d:\d\d:\d\d \d{4})\] //;
    $data{date} = $strp->parse_datetime($1);

    # consume message type
    next unless $line =~ s/^\[(\w+)\] //;
    $data{type} = $1;

    # "[source ip]" alternative
    if ($line =~ s/^\[(\w+) ([\d\.]+)\] //) {
        @data{qw(source ip)} = ($1, $2);

        # consume "error: detail"
        next unless $line =~ s/([^:]+): (.*)//;
        @data{qw(error detail)} = ($1, $2);
    }
    # "descriptor: message" alternative
    elsif ($line =~ s/^([^:]+): (.*)//) {
        @data{qw(descriptor message)} = ($1, $2);
    }
    # invalid
    else {
        next;
    }

    # something left: invalid
    next if length $line;

    # parsed ok: output
    say "$_: $data{$_}" for keys %data;
    say '-' x 40;
}
__DATA__
[Sun Sep 02 03:34:01 2012] [notice] Digest: done
[Sun Sep 02 03:34:01 2012] [notice] Apache/2.2.15 (Unix) DAV/2 mod_ssl/2.2.15 OpenSSL/1.0.0- fips SVN/1.6.11 configured -- resuming normal operations
[Sun Sep 02 03:34:01 2012] [error] avahi_entry_group_add_service_strlst("localhost") failed: Invalid host name
[Sun Sep 02 08:01:14 2012] [error] [client 216.244.73.194] File does not exist: /var/www/html/manager
[Sun Sep 02 11:04:35 2012] [error] [client 58.218.199.250] File does not exist: /var/www/html/proxy
Output
descriptor: Digest
date: 2012-09-02T03:34:01
type: notice
message: done
----------------------------------------
descriptor: avahi_entry_group_add_service_strlst("localhost") failed
date: 2012-09-02T03:34:01
type: error
message: Invalid host name
----------------------------------------
detail: /var/www/html/manager
source: client
ip: 216.244.73.194
date: 2012-09-02T08:01:14
error: File does not exist
type: error
----------------------------------------
detail: /var/www/html/proxy
source: client
ip: 58.218.199.250
date: 2012-09-02T11:04:35
error: File does not exist
type: error
----------------------------------------
Note that according to your format description, the second line is invalid and ignored by the program.
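For comparison, the same step-by-step idea can be sketched in Python. This is only an illustration under assumptions: the function name parse_error_line and the dictionary keys are made up here to mirror the Perl hash, and strptime with %a and %b expects English day and month names (the default C locale).

```python
import re
from datetime import datetime


def parse_error_line(line):
    """Parse one error_log line step by step; return a dict, or None if invalid."""
    data = {}

    # Step 1: consume the "[DAY MMM DD HH:MM:SS YYYY] " prefix.
    m = re.match(r"\[(\w+ \w+ \d+ \d\d:\d\d:\d\d \d{4})\] ", line)
    if not m:
        return None
    data["date"] = datetime.strptime(m.group(1), "%a %b %d %H:%M:%S %Y")
    rest = line[m.end():]

    # Step 2: consume the "[MSG_TYPE] " field.
    m = re.match(r"\[(\w+)\] ", rest)
    if not m:
        return None
    data["type"] = m.group(1)
    rest = rest[m.end():]

    # Step 3: the optional "[SOURCE IP] " field decides which layout applies.
    m = re.match(r"\[(\w+) ([\d.]+)\] ", rest)
    if m:
        data["source"], data["ip"] = m.groups()
        rest = rest[m.end():]
        keys = ("error", "detail")
    else:
        keys = ("descriptor", "message")

    # Step 4: whatever is left must look like "NAME: TEXT", otherwise reject it.
    m = re.fullmatch(r"([^:]+): (.*)", rest)
    if not m:
        return None
    data[keys[0]], data[keys[1]] = m.groups()
    return data


if __name__ == "__main__":
    sample = ("[Sun Sep 02 08:01:14 2012] [error] [client 216.244.73.194] "
              "File does not exist: /var/www/html/manager")
    print(parse_error_line(sample))
```

Because each step only looks at the start of what remains, spaces inside a field never have to be told apart from delimiter spaces.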

Related

config fail2ban on joomla

Hello everyone. I'm trying to configure Joomla with fail2ban, so I created
the file /etc/fail2ban/filter.d/joomla-error.conf
and added the failregex below:
failregex = [[]client <HOST>[]] user .* authentication failure.*
Then I added this section to jail.conf:
[joomla-error]
enabled = true
port = http,https
filter = joomla-error
logpath = /var/log/httpd/domains/jayjezz.com.error.log
maxretry = 5
bantime = 30
The logpath is correct, but every time I try to reload the fail2ban service I get:
ERROR NOK: ("No 'host' group in '[[]client <HOST>[]] user .* authentication failure.*'",)
I think something is wrong with my regex. Can someone provide the right regex for this log line?
[Thu Sep 28 17:14:23.932811 2017] [:error] [pid 6673] [client 000.000.000.000:56806] user xxxxx authentication failure, referer: http://jayjezz.com/administrator/index.php
thank you
I fixed this by adding a script that changes file permissions inside the Joomla website. Now I cannot log in under /administrator without launching the script first.
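On the regex itself: the quoted log line has a port after the client address, which the posted failregex does not allow for. The sketch below only checks a candidate pattern in plain Python, not inside fail2ban. The (?P<host>...) group is a simplified stand-in for fail2ban's <HOST> tag, and the names PATTERN and LINE are made up for the example.

```python
import re

# Simplified stand-in for <HOST>: dotted digits only (an assumption for this demo).
PATTERN = re.compile(
    r"\[client (?P<host>[\d.]+)(?::\d{1,5})?\] user .* authentication failure"
)

LINE = ("[Thu Sep 28 17:14:23.932811 2017] [:error] [pid 6673] "
        "[client 000.000.000.000:56806] user xxxxx authentication failure, "
        "referer: http://jayjezz.com/administrator/index.php")

match = PATTERN.search(LINE)
print(match.group("host") if match else "no match")
```

In a filter file the equivalent idea would be to escape the brackets and make the port optional after <HOST>. Whether that also clears the "No 'host' group" error depends on the fail2ban version, so test it with fail2ban-regex.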

why does fail2ban not match script not found

Why does the following fail2ban regex
failregex = ^%(_apache_error_client)s ((AH001(28|30): )?File does not exist|(AH01264: )?script not found or unable to stat).*$
^%(_apache_error_client)s script '.*' not found or unable to stat
not match
[client 111.111.111.111:51008] script '/srv/www/htdocs/wwwuni/fileadmin/Dokumente/index.php' not found or unable to stat
My problem was solved after changing the definition of
_apache_error_client in apache-common.conf to:
_apache_error_client = \[[^]]*\] \[(:error|\S+:\S+)\]( \[pid \d+\])? \[client <HOST>(:\d{1,5})?\]
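The log line in the question carries a client port (:51008), and the (:\d{1,5})? part of the new definition is what lets it through. A small plain-Python check of that one detail, with (?P<host>...) as a simplified stand-in for <HOST> and hypothetical variable names:

```python
import re

HOST = r"(?P<host>[\d.]+)"  # simplified stand-in for fail2ban's <HOST> (assumption)
TAIL = r" script '.*' not found or unable to stat"

# Client field without and with an optional ":port" suffix.
without_port = re.compile(r"\[client " + HOST + r"\]" + TAIL)
with_port = re.compile(r"\[client " + HOST + r"(:\d{1,5})?\]" + TAIL)

LINE = ("[client 111.111.111.111:51008] script "
        "'/srv/www/htdocs/wwwuni/fileadmin/Dokumente/index.php' "
        "not found or unable to stat")

print("without port:", bool(without_port.search(LINE)))
print("with port:", bool(with_port.search(LINE)))
```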

fail2ban custom filter on multiline

Is it possible to catch an authentication failure that spans multiple lines with a fail2ban regex?
Here is an example:
Sep 08 11:54:59.207814 afpd[16190] {dsi_tcp.c:241} (I:DSI): AFP/TCP session from 10.0.71.149:53863
Sep 08 11:54:59.209504 afpd[16190] {uams_dhx2_pam.c:329} (I:UAMS): DHX2 login: thierry
Sep 08 11:54:59.272092 afpd[16190] {uams_dhx2_pam.c:214} (I:UAMS): PAM DHX2: PAM Success
Sep 08 11:55:01.522258 afpd[16190] {uams_dhx2_pam.c:666} (I:UAMS): DHX2: PAM_Error: Authentication failure
Thanks
Yes. fail2ban uses Python regular expressions with the multiline option. In your case, try:
"afpd\[[0-9]+\] {dsi_tcp.c:241} \(I:DSI\): AFP/TCP session from <HOST>:[0-9]+\n.*afpd\[[0-9]+\] {uams_dhx2_pam.c:[0-9]+}.*\n.*afpd\[[0-9]+\] {uams_dhx2_pam.c:[0-9]+}.*\n.*afpd\[[0-9]+\] {uams_dhx2_pam.c:[0-9]+}.*PAM_Error: Authentication failure"
As you can see, you just have to put \n where needed. Don't forget to set the maxlines option to 4 in your case, so that fail2ban uses 4 lines to match the regex. Your filter file should look something like this:
[Init]
maxlines = 4
[Definition]
failregex = "afpd\[[0-9]+\] {dsi_tcp.c:241} \(I:DSI\): AFP/TCP session from <HOST>:[0-9]+\n.*afpd\[[0-9]+\] {uams_dhx2_pam.c:[0-9]+}.*\n.*afpd\[[0-9]+\] {uams_dhx2_pam.c:[0-9]+}.*\n.*afpd\[[0-9]+\] {uams_dhx2_pam.c:[0-9]+}.*PAM_Error: Authentication failure"
ignoreregex =
Use fail2ban-regex to test your regex.
I was just looking for a solution to the same problem, but I think the answer given by wpoely86 can lead to blocking innocent IPs if multiple IPs connect at more or less the same time.
Sep 08 11:54:59.207814 afpd[16190] {dsi_tcp.c:241} (I:DSI): AFP/TCP session from 10.0.71.149:53863
Sep 08 11:54:59.207815 afpd[99999] {dsi_tcp.c:241} (I:DSI): AFP/TCP session from 10.10.10.10:53864
Sep 08 11:54:59.209504 afpd[16190] {uams_dhx2_pam.c:329} (I:UAMS): DHX2 login: thierry
Sep 08 11:54:59.272092 afpd[16190] {uams_dhx2_pam.c:214} (I:UAMS): PAM DHX2: PAM Success
Sep 08 11:55:01.522258 afpd[16190] {uams_dhx2_pam.c:666} (I:UAMS): DHX2: PAM_Error: Authentication failure
Sep 08 11:55:01.522258 afpd[99999] {uams_dhx2_pam.c:666} (I:UAMS): DHX2: PAM_success: Authentication succeeded
Above, the offending connection came from 10.0.71.149. However, the regex would block 10.10.10.10. In other words, the regex would need to distinguish between afpd[99999] and afpd[16190] (which identify the PID of the afpd process).
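One way to tie the session line to the failure line of the same process is a named backreference on the PID. The sketch below runs in plain Python against the interleaved sample above; it has not been tested inside fail2ban, where maxlines would also have to cover the interleaved lines. The (?P<host>...) group is a simplified stand-in for <HOST>, and PATTERN and LOG are names invented for the example.

```python
import re

LOG = """\
Sep 08 11:54:59.207814 afpd[16190] {dsi_tcp.c:241} (I:DSI): AFP/TCP session from 10.0.71.149:53863
Sep 08 11:54:59.207815 afpd[99999] {dsi_tcp.c:241} (I:DSI): AFP/TCP session from 10.10.10.10:53864
Sep 08 11:54:59.209504 afpd[16190] {uams_dhx2_pam.c:329} (I:UAMS): DHX2 login: thierry
Sep 08 11:54:59.272092 afpd[16190] {uams_dhx2_pam.c:214} (I:UAMS): PAM DHX2: PAM Success
Sep 08 11:55:01.522258 afpd[16190] {uams_dhx2_pam.c:666} (I:UAMS): DHX2: PAM_Error: Authentication failure
Sep 08 11:55:01.522258 afpd[99999] {uams_dhx2_pam.c:666} (I:UAMS): DHX2: PAM_success: Authentication succeeded
"""

PATTERN = re.compile(
    # Session line: remember the PID and the client address.
    r"afpd\[(?P<pid>\d+)\] \{dsi_tcp\.c:\d+\} \(I:DSI\): "
    r"AFP/TCP session from (?P<host>[\d.]+):\d+\n"
    # Skip as few whole lines as possible (they may belong to other PIDs).
    r"(?:.*\n)*?"
    # Failure line: must carry the same PID as the session line.
    r".*afpd\[(?P=pid)\] .*PAM_Error: Authentication failure"
)

hosts = [m.group("host") for m in PATTERN.finditer(LOG)]
print(hosts)
```

The (?P=pid) backreference is what keeps afpd[99999] from being blamed for a failure logged by afpd[16190].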

How to match only the ip and not letters

I am trying to parse an error log with regex. It will give me everything I want but now I want to omit the text "client", or any text that would be in that place. All I want from between the [] is the ip address.
^\[([^]]+)\]\s*\[([^]]+)\]\s*\[([^]]+)\]\s*([A-Za-z ]*)[:]\s*([\/a-z-]*)$
Here is a line from the log.
[Mon Aug 23 15:25:35 2010] [error] [client 80.154.42.54] File does not exist: /var/www/phpmy-admin
This should do it:
^\[([^]]+)\]\s*\[([^]]+)\]\s*\[[a-zA-Z ]*([0-9.]+)\]\s*([A-Za-z ]*)[:]\s*([\/a-z-]*)$
Working regex example:
http://regex101.com/r/uN3fO3
Matches (using your example data):
1. `Mon Aug 23 15:25:35 2010`
2. `error`
3. `80.154.42.54`
4. `File does not exist`
5. `/var/www/phpmy-admin`
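The same check can be reproduced locally. A minimal Python sketch using the corrected regex and the log line from the question (the variable names are made up for the example):

```python
import re

PATTERN = re.compile(
    r"^\[([^]]+)\]\s*\[([^]]+)\]\s*\[[a-zA-Z ]*([0-9.]+)\]\s*([A-Za-z ]*)[:]\s*([\/a-z-]*)$"
)

LINE = ("[Mon Aug 23 15:25:35 2010] [error] [client 80.154.42.54] "
        "File does not exist: /var/www/phpmy-admin")

match = PATTERN.match(LINE)
# The third bracket now captures only the address; "client " is consumed
# by [a-zA-Z ]* outside the capturing group.
timestamp, level, ip, error, path = match.groups()
print(ip)
```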

Fail2Ban regex does not match

I'm using fail2ban. For some reason fail2ban refuses to compile my regex. Here are the log lines that I need to match:
root@server1:/etc/fail2ban/filter.d# tail /var/log/apache2/error.log
[Sun Apr 20 10:40:05 2014] [error] [client 75.144.181.151] user root: authentication failure for "/phpmyadmin/": Password Mismatch
[Sun Apr 20 10:40:16 2014] [error] [client 75.144.181.151] user root: authentication failure for "/phpmyadmin/": Password Mismatch
[Sun Apr 20 10:40:38 2014] [error] [client 75.144.181.151] user haker not found: /phpmyadmin/
[Sun Apr 20 10:40:44 2014] [error] [client 75.144.181.151] user pentest not found: /phpmyadmin/
And here is my fail2ban filter.d file:
root@server1:/etc/fail2ban/filter.d# cat /etc/fail2ban/filter.d/phpmyadmin.conf
[Definition]
failregex = [client <HOST>;] user .*; not found: \/phpmyadmin\/|[client <HOST>;] user root: authentication failure for "\/phpmyadmin\/":
ignoreregex =
Here is the regex line from the file above:
[client <HOST>;] user .*; not found: \/phpmyadmin\/|[client <HOST>;] user root: authentication failure for "\/phpmyadmin\/":
Unfortunately the fail2ban log file reports an error about the regex: "Unable to compile regular expression".
root@server1:/etc/fail2ban# tail /var/log/fail2ban.log
2014-04-20 10:47:06,788 fail2ban.filter : INFO Added logfile = /var/log/apache2/error.log
2014-04-20 10:47:06,789 fail2ban.filter : INFO Set maxRetry = 3
2014-04-20 10:47:06,789 fail2ban.filter : INFO Set findtime = 600
2014-04-20 10:47:06,790 fail2ban.actions: INFO Set banTime = 600
2014-04-20 10:47:06,790 fail2ban.filter : ERROR Unable to compile regular expression '[client (?:::f{4,6}:)?(?P<host>[\w\-.^_]+);] user .*; not found: \/phpmyadmin\/|[client (?:::f{4,6}:)?(?P<host>[\w\-.^_]+);] user root: authentication failure for "\/phpmyadmin\/":'
2014-04-20 10:47:06,794 fail2ban.jail : INFO Jail 'ssh' started
2014-04-20 10:47:06,799 fail2ban.jail : INFO Jail 'pureftpd' started
2014-04-20 10:47:06,805 fail2ban.jail : INFO Jail 'phpmyadmin' started
My regex: http://regex101.com/r/kU7tX3. What is wrong with it? Any help is appreciated. Thank you.
I would have asked this in a comment, but I cannot add comments, so I will do my best to understand the requirement and give an answer.
Requirement: I think you want to filter all lines containing
"authentication failure for "/phpmyadmin/""
You can do so by changing your regular expression to the following:
failregex = .*authentication failure for "\/phpmyadmin\/"
You may have to escape the double quotes (").
Please comment if I have misunderstood the requirement.
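As for why the original expression does not compile: in Python's regex syntax an unescaped [ opens a character class, so "[client ..." swallows the start of the host group and leaves a stray closing parenthesis. The sketch below shows this in plain Python, using the <HOST> expansion printed in the fail2ban log above. The FIXED pattern is only one possible rewrite and rests on assumptions: the semicolons are dropped because the quoted log lines contain none, and the two alternatives share a single host group because Python rejects a group name defined twice.

```python
import re

# First alternative of the expanded expression, as printed in fail2ban.log.
BROKEN = r'[client (?:::f{4,6}:)?(?P<host>[\w\-.^_]+);] user .*; not found: \/phpmyadmin\/'

try:
    re.compile(BROKEN)
    compile_error = None
except re.error as exc:
    compile_error = str(exc)
print("compile error:", compile_error)

# Escaped brackets, one host group, both failure messages as alternatives.
FIXED = re.compile(
    r'\[client (?:::f{4,6}:)?(?P<host>[\w\-.^_]+)\] user '
    r'(?:.* not found: /phpmyadmin/'
    r'|root: authentication failure for "/phpmyadmin/":)'
)

LINES = [
    '[Sun Apr 20 10:40:05 2014] [error] [client 75.144.181.151] user root: '
    'authentication failure for "/phpmyadmin/": Password Mismatch',
    '[Sun Apr 20 10:40:38 2014] [error] [client 75.144.181.151] user haker '
    'not found: /phpmyadmin/',
]
found = [FIXED.search(line).group("host") for line in LINES]
print(found)
```

In the filter file the same idea would be written with <HOST> in place of the expanded group. Run fail2ban-regex against the real log before relying on it.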