Create a cakephp filter for fail2ban - regex

I would like to create a fail2ban filter that finds and blocks bad requests such as "Controller class * could not be found."
To do this, I created a cakephp.conf file in fail2ban's filter.d directory. Its content:
[Definition]
failregex = ^[0-9]{4}\-[0-9]{2}\-[0-9]{2}.*Error:.*\nStack Trace:\n(\-.*|\n)*\n.*\n.*\nClient IP: <HOST>\n$
ignoreregex =
My example error log looks like this:
...
2020-10-08 19:59:46 Error: [Cake\Http\Exception\MissingControllerException] Controller class Webfig could not be found. in /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Controller/ControllerFactory.php on line 158
Stack Trace:
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Controller/ControllerFactory.php:46
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/BaseApplication.php:249
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:77
- /home/myapplication/htdocs/vendor/cakephp/authentication/src/Middleware/AuthenticationMiddleware.php:122
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:77
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Middleware/CsrfProtectionMiddleware.php:146
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:58
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Routing/Middleware/RoutingMiddleware.php:172
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Routing/Middleware/AssetMiddleware.php:68
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Error/Middleware/ErrorHandlerMiddleware.php:121
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:58
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Server.php:90
- /home/myapplication/htdocs/webroot/index.php:40
Request URL: /webfig/
Referer URL: http://X.X.X.X/webfig/
Client IP: X.X.X.X
...
(The real addresses have been replaced with X.X.X.X.)
But I can't match any IP addresses. The fail2ban tester says:
root#test:~# fail2ban-regex /home/myapplication/htdocs/logs/error.log /etc/fail2ban/filter.d/cakephp.conf
Running tests
=============
Use failregex filter file : cakephp, basedir: /etc/fail2ban
Use log file : /home/myapplication/htdocs/logs/error.log
Use encoding : UTF-8
Results
=======
Failregex: 0 total
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [719] {^LN-BEG}ExYear(?P<_sep>[-/.])Month(?P=_sep)Day(?:T| ?)24hour:Minute:Second(?:[.,]Microseconds)?(?:\s*Zone offset)?
`-
Lines: 15447 lines, 0 ignored, 0 matched, 15447 missed
[processed in 10.02 sec]
Missed line(s): too many to print. Use --print-all-missed to print all 15447 lines
I can't see what is wrong. Can you help me? :)
Thanks

The issue is that your log is poorly suited to parsing: it is a multi-line log file, so the IP appears on a different line from the failure message.
Worse, the line with the IP carries no ID (no information shared with the failure line), so things get worse still if several messages interleave: a Client IP from another message that is not a failure may follow the failure message.
If you can change the log format, do that, so that the date, the IP and the failure sign are on the same line. For example, if you use nginx, set up conditional logging to the access log from the PHP location for the error case.
See Fail2ban :: wiki :: Best practice for more info.
If you cannot do that (though changing it would be better), you can use multi-line buffering and parsing with the maxlines parameter and the <SKIPLINES> regex tag.
Your filter would look something like this:
[Definition]
# the stack trace is ignored, so the buffer window does not need to be large;
# 5 would be enough, but use 10 to be safe (in case log messages interleave):
maxlines = 10
ignoreregex = ^(?:Stack |- /)
failregex = ^\s+Error: \[[^\]]+\] Controller class \S+ could not be found\..*<SKIPLINES>^((?:Request|Referer) URL:.*<SKIPLINES>)*^Client IP: <HOST>
To test it directly use:
fail2ban-regex --maxlines=5 /path/to/log '^\s+Error: \[[^\]]+\] Controller class \S+ could not be found\..*<SKIPLINES>^((?:Request|Referer) URL:.*<SKIPLINES>)*^Client IP: <HOST>' '^(?:Stack |- /)'
But as already said, this is really ugly; it is better to find a way to log everything on a single line.
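To see why the multi-line pattern can match, the sketch below reproduces the matching in plain Python. It is only an approximation made under several assumptions: that fail2ban removes the timestamp before matching, that lines matched by ignoreregex never reach the buffer, and that <SKIPLINES> behaves like a lazy "skip any number of lines" group. The <HOST> replacement is a simplified IPv4 pattern, and the address 192.0.2.7 and the helper names expand and find_host are made up for illustration.

```python
import re

FAILREGEX = (
    r'^\s+Error: \[[^\]]+\] Controller class \S+ could not be found\..*<SKIPLINES>'
    r'^((?:Request|Referer) URL:.*<SKIPLINES>)*^Client IP: <HOST>'
)
IGNOREREGEX = r'^(?:Stack |- /)'

# Assumption: a simple stand-in for the timestamp that fail2ban detects and removes.
DATE = re.compile(r'^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}')


def expand(failregex):
    """Approximate the tag expansion (simplified, not fail2ban's real code)."""
    rx = failregex.replace('<SKIPLINES>', r'\n(?:.*\n)*?')
    rx = rx.replace('<HOST>', r'(?P<host>\d{1,3}(?:\.\d{1,3}){3})')
    return re.compile(rx, re.MULTILINE)


def find_host(log_lines):
    """Return the client IP of the first matching failure, or None."""
    ignore = re.compile(IGNOREREGEX)
    buffer = []
    for line in log_lines:
        line = DATE.sub('', line)   # the timestamp is removed before matching
        if ignore.search(line):     # stack-trace lines are dropped
            continue
        buffer.append(line)
    match = expand(FAILREGEX).search('\n'.join(buffer) + '\n')
    return match.group('host') if match else None


SAMPLE = [
    r'2020-10-08 19:59:46 Error: [Cake\Http\Exception\MissingControllerException] '
    r'Controller class Webfig could not be found. in '
    r'/home/myapplication/htdocs/vendor/cakephp/cakephp/src/Controller/ControllerFactory.php on line 158',
    'Stack Trace:',
    '- /home/myapplication/htdocs/webroot/index.php:40',
    'Request URL: /webfig/',
    'Referer URL: http://192.0.2.7/webfig/',
    'Client IP: 192.0.2.7',
]

print(find_host(SAMPLE))
```

Because the stack-trace lines are filtered out first, only four lines have to fit in the buffer here, which is why a small maxlines value is sufficient.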

Related

How to disable JSON format and send only the log message to Sumologic with Fluentbit?

We are using Fluent Bit as a sidecar container in our ECS Fargate cluster, which runs a .NET application. Initially Fluent Bit sent the logs split across multiple lines, and we solved that with the Fluent Bit multiline feature. The multi-line logs now arrive in Sumo Logic intact, but they are sent as JSON, whereas we want Fluent Bit to send only the raw log.
The logs currently look like this:
{
date:1675120653.269619,
container_id:"xvgbertytyuuyuyu",
container_name:"XXXXXXXXXX",
source:"stdout",
log:"2023-01-30 23:17:33.269Z DEBUG [.NET ThreadPool Worker] Connection.ManagedDbConnection - ComponentInstanceEntityAsync - Executing stored proc: dbo.prcGetComponentInstance"
}
We want only the line
2023-01-30 23:17:33.269Z DEBUG [.NET ThreadPool Worker] Connection.ManagedDbConnection - ComponentInstanceEntityAsync - Executing stored proc: dbo.prcGetComponentInstance
You need to modify Fluent Bit configuration to have the following filters and output configuration:
fluent.conf:
## prepare headers for Sumo Logic
[FILTER]
Name record_modifier
Match *
Record headers.content-type text/plain
## Set headers as headers attribute
[FILTER]
Name nest
Match *
Operation nest
Wildcard headers.*
Nest_under headers
Remove_prefix headers.
[OUTPUT]
Name http
...
# use log key as body
body_key $log
# use headers key as headers
headers_key $headers
That way, you craft the HTTP request manually. This sends one request per log record, which is not necessarily a good idea. To mitigate that, you can add the following parser and use it (flush_timeout may need adjusting):
parsers.conf:
# merge everything as one big log
[MULTILINE_PARSER]
name multiline-all
type regex
flush_timeout 500
#
# Regex rules for multiline parsing
# ---------------------------------
#
# configuration hints:
#
# - first state always has the name: start_state
# - every field in the rule must be inside double quotes
#
# rules | state name | regex pattern | next state
# ------|---------------|--------------------------------------------
rule "start_state" ".*" "cont"
rule "cont" ".*" "cont"
fluent.conf:
[INPUT]
name tail
...
multiline.parser multiline-all
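To make the effect of the two filters easier to follow, here is a small Python model of what happens to a single record. It is not Fluent Bit code: the functions record_modifier and nest are hypothetical stand-ins that only mimic the options used in the configuration above (Record, Wildcard, Nest_under, Remove_prefix), and the record is a shortened version of the one in the question.

```python
def record_modifier(record, key, value):
    """Model of `Record <key> <value>`: append one key/value pair."""
    out = dict(record)
    out[key] = value
    return out


def nest(record, prefix, nest_under):
    """Model of `Operation nest` with `Wildcard <prefix>*` and `Remove_prefix <prefix>`."""
    out, nested = {}, {}
    for key, value in record.items():
        if key.startswith(prefix):
            nested[key[len(prefix):]] = value   # strip the prefix inside the nested map
        else:
            out[key] = value
    if nested:
        out[nest_under] = nested
    return out


record = {
    "date": 1675120653.269619,
    "source": "stdout",
    "log": "2023-01-30 23:17:33.269Z DEBUG [.NET ThreadPool Worker] "
           "Connection.ManagedDbConnection - ComponentInstanceEntityAsync - "
           "Executing stored proc: dbo.prcGetComponentInstance",
}

record = record_modifier(record, "headers.content-type", "text/plain")
record = nest(record, "headers.", "headers")

body = record["log"]          # the field that body_key $log selects
headers = record["headers"]   # the map that headers_key $headers selects

print(headers)
print(body)
```

In this model the request body ends up as the raw log line only, and the nested headers map carries the text/plain content type.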

Extract a motif in various url strings with regex in ruby

I have different types of strings (log lines, in fact):
2022-08-03T16:20:41 - INFO - server.py - 649 - 192.168.1.24,192.168.1.29 - - [03/Aug/2022 16:20:41] "GET /get_customer_by_id/0024-A HTTP/1.0" 200 554 0.007798
2022-08-03T16:20:56 - INFO - utils.py - 10 - GET - http://192.168.1.24/get_customer_by_id/0025-A
2022-08-03T16:21:13 - INFO - utils.py - 10 - POST - http://192.168.1.24/order
I want to extract the customer id from each get_customer_by_id URL. For the example above, I'm looking for 0024-A and 0025-A.
I tried the regex \/get_result\/(.+), but it captures the whole rest of the line when something follows the customer id.
You can see the details here: https://rubular.com/r/FgBxR1kUyQAYSl
How can I solve this?
Thanks a lot for your help!
I suppose you're looking for something like /\/get_customer_by_id\/(\S+)/. This grabs all non-whitespace characters, stopping before the HTTP/1.0 on the first line. If you know the id always has the form dddd-s (digits, a hyphen and a single word character, as in 0024-A), you could also use something like /\/get_customer_by_id\/(\d+-\w)/. Either way, the id will be in the first capture group.
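As a quick check of both patterns against the three sample lines, here is a minimal sketch. It uses Python's re module rather than Ruby; these two patterns use only syntax that the two engines share. The helper name customer_ids is made up for illustration.

```python
import re

LOG_LINES = [
    '2022-08-03T16:20:41 - INFO - server.py - 649 - 192.168.1.24,192.168.1.29 - - '
    '[03/Aug/2022 16:20:41] "GET /get_customer_by_id/0024-A HTTP/1.0" 200 554 0.007798',
    '2022-08-03T16:20:56 - INFO - utils.py - 10 - GET - http://192.168.1.24/get_customer_by_id/0025-A',
    '2022-08-03T16:21:13 - INFO - utils.py - 10 - POST - http://192.168.1.24/order',
]

ANY_NON_SPACE = re.compile(r'/get_customer_by_id/(\S+)')   # stops at the first whitespace
STRICT_ID = re.compile(r'/get_customer_by_id/(\d+-\w)')    # digits, hyphen, one word character


def customer_ids(pattern, lines):
    """Collect the first capture group from every line that matches."""
    ids = []
    for line in lines:
        match = pattern.search(line)
        if match:
            ids.append(match.group(1))
    return ids


print(customer_ids(ANY_NON_SPACE, LOG_LINES))
print(customer_ids(STRICT_ID, LOG_LINES))
```

Both patterns return the same two ids for these lines; the stricter one is safer if other non-whitespace text can directly follow the id.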

Regex for "wp-admin" "wp-login" entries in syslog trying on drupal sites

I am looking for a fail2ban regex (or two) to catch wp-admin and wp-login attempts on Drupal sites.
The regex should match "drupal:" and "page not found" and ("wp-admin" or "wp-login").
The problem for me is the "and" conditions.
The logfile entries:
Apr 7 10:59:23 webserver drupal: https://www.anywebsite.com|1617785962|page not found|123.456.789.112|https://www.anywebsite.com/wp-admin/admin-ajax.php?action=revslider_show_image&img=../wp-config.php|https://anywebsite.com/wp-admin/admin-ajax.php?action=revslider_show_image&img=../wp-config.php|0||wp-admin/admin-ajax.php
Apr 7 06:53:47 webserver drupal: https://www.anywebsite.com|1617771227|page not found|123.456.789.112|https://www.anywebsite.com/wp/wp-login.php||0||wp/wp-login.php
Here you go:
failregex = ^\s*\S+ drupal: [^|]*\|\d+\|(?:page not found)\|<ADDR>
Replace <ADDR> with <HOST> for fail2ban versions before v0.10.
WARNING: this assumes that the first URI in your log line (site? referrer?) after drupal: never contains a pipe character, so that an intruder cannot insert one into the URI to avoid a ban. Otherwise it becomes complex (you must anchor it from both sides, or write conditional REs with lookaheads or lookbehinds).
Also note that if your site can return 404 to legitimate users (because of missing references and so on), you have to extend the RE with a more precise pattern that excludes your own missing pages, to avoid false positives, e.g. something like this (with blacklisting expressions):
_block_uris = wp-admin|(?:wp/)?wp-login
failregex = ^\s*\S+ drupal: [^|]*\|\d+\|(?:page not found)\|<ADDR>\|\w+://[^/]+/(?:%(_block_uris)s)
or (with whitelisting expressions, here ignoring my-page/ and my-side/ URIs):
_ignore_uris = my-page/|my-side/
failregex = ^\s*\S+ drupal: [^|]*\|\d+\|(?:page not found)\|<ADDR>\|\w+://[^/]+/(?!%(_ignore_uris)s)
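The blacklisting variant can be tried out in plain Python, as sketched below. The sketch makes several assumptions: the timestamp has already been removed from each line (as fail2ban does before matching), <ADDR> is replaced by a simplified IPv4 pattern, the wp/ prefix in _block_uris is treated as optional, and the addresses 192.0.2.10 and 192.0.2.20 as well as the old-article URI are invented test data.

```python
import re

ADDR = r'(?P<addr>\d{1,3}(?:\.\d{1,3}){3})'    # simplified stand-in for <ADDR>
BLOCK_URIS = r'wp-admin|(?:wp/)?wp-login'      # assumption: the wp/ prefix is optional
FAILREGEX = (
    r'^\s*\S+ drupal: [^|]*\|\d+\|(?:page not found)\|<ADDR>'
    r'\|\w+://[^/]+/(?:%(_block_uris)s)'
)


def compile_filter(failregex, block_uris):
    """Interpolate %(_block_uris)s and the address tag, then compile."""
    rx = failregex % {'_block_uris': block_uris}
    return re.compile(rx.replace('<ADDR>', ADDR))


FILTER = compile_filter(FAILREGEX, BLOCK_URIS)


def offender(line):
    """Return the address to ban, or None if the line is not a failure."""
    match = FILTER.search(line)
    return match.group('addr') if match else None


PREFIX = ' webserver drupal: https://www.anywebsite.com|1617785962|page not found|'
WP_ADMIN = PREFIX + '192.0.2.10|https://www.anywebsite.com/wp-admin/admin-ajax.php||0||wp-admin/admin-ajax.php'
WP_LOGIN = PREFIX + '192.0.2.10|https://www.anywebsite.com/wp/wp-login.php||0||wp/wp-login.php'
LEGIT_404 = PREFIX + '192.0.2.20|https://www.anywebsite.com/old-article||0||old-article'

print(offender(WP_ADMIN))
print(offender(WP_LOGIN))
print(offender(LEGIT_404))
```

Only the two WordPress probes yield an address; the ordinary missing page is ignored, which is the point of anchoring the URI part of the expression.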

JAVAMETHOD grok pattern with optional thread number at the end

I'm trying to parse log4j messages:
2019-12-02 20:48:20.198utc DEBUG UnknownElementContentHandler,streamLock-9-th-11:32 - blabla
2019-11-19 23:40:04.014utc WARN AnnotationBinder,localhost-startStop-1:611 - blabla
2019-11-19 23:40:04.014utc INFO CovImCtl,main:109 - blabla
with grok pattern
%{TIMESTAMP_ISO8601:timestamp}utc%{SPACE}%{LOGLEVEL:level}%{SPACE}%{JAVACLASS:class},%{JAVAMETHOD1:method}:%{POSINT:lineno}%{SPACE}-%{SPACE}%{GREEDYDATA:message}
using a variation on the standard pattern:
JAVAMETHOD (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)
JAVAMETHOD1 (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_\-0-9]*)
JAVAMETHOD worked for "main" but not for the others (the pattern was missing the -).
JAVAMETHOD1 works, but I need to capture the optional trailing integer as a "thread_no" field (11 from streamLock-9-th-11, 1 from localhost-startStop-1).
I'm racking my brain: methods like streamLock-9-th-11 have an internal "-\d+" (here "-9") that belongs to the method name "streamLock-9-th".
Any ideas?
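One possible direction, sketched here as a plain Python regular expression and not as a tested grok pattern, is to make the method name lazy and follow it with an optional "-digits" group that must sit directly before the colon. The group names method, thread_no and lineno and the helper split_method are assumptions for illustration, and the <init>/<clinit> alternative of JAVAMETHOD is left out for brevity.

```python
import re

METHOD_THREAD = re.compile(
    r'(?P<method>[a-zA-Z$_][a-zA-Z$_\-0-9]*?)'   # lazy, so a trailing "-N" is not swallowed
    r'(?:-(?P<thread_no>\d+))?'                   # optional thread number before the colon
    r':(?P<lineno>\d+)'
)


def split_method(text):
    """Split 'method[-thread]:line' into (method, thread_no or None, lineno)."""
    match = METHOD_THREAD.fullmatch(text)
    if match is None:
        return None
    return match.group('method'), match.group('thread_no'), match.group('lineno')


for sample in ('streamLock-9-th-11:32', 'localhost-startStop-1:611', 'main:109'):
    print(sample, '->', split_method(sample))
```

The internal "-9" stays in the method name because the optional group can only match when it is immediately followed by the colon. Whether the same construct can be dropped into a grok pattern unchanged would still need to be checked.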

fail2ban scan for 403 in nginx access logs

I have set up some specific rules in nginx that block certain URLs and extensions (aspx, sh, jsp, etc.).
I have also enabled a custom access log file only for 403|429|410 errors, so that all my access-denied entries are in one place.
My goal is to have fail2ban read this log and ban the IP for every GET/POST that ends in a 403 error.
1) nginx.conf writes the custom log file like this:
log_format limit '$time_local - $remote_addr "$request" $status';
and this is a log entry:
03/Jan/2017:15:53:01 +0100 - 1.2.3.4 "GET /aaa.jsp HTTP/1.1" 403
2) I have a fail2ban filter like this:
^<HOST> .* "(GET|POST) [^"]+" 403
3) I have tried it with fail2ban-regex:
fail2ban-regex /var/log/nginx/access-live-limitbot-website.log /etc/fail2ban/filter.d/nginx-403.conf
and this is the output:
Results
=======
Failregex: 0 total
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [1] Day/MONTH/Year:Hour:Minute:Second
`-
Lines: 2 lines, 0 ignored, 0 matched, 2 missed
|- Missed line(s):
| 217.19.158.242 "POST /wp-login.php HTTP/1.1" 403
| 03/Jan/2017:15:53:01 +0100 - 217.19.158.242 "GET /aaa.jsp HTTP/1.1" 403
`-
and I never get an entry matching the error code.
Can someone please help me with a regex for my custom log?
Thank you.
Fail2ban is picky about the date format. Also, for ease of matching, I suggest reordering the items in the log.
For the date format, see the documentation here:
https://www.fail2ban.org/wiki/index.php/MANUAL_0_8
In order for a log line to match your failregex, it actually has to match in two parts: the beginning of the line has to match a timestamp pattern or regex, and the remainder of the line has to match your failregex. If the failregex is anchored with a leading ^, then the anchor refers to the start of the remainder of the line, after the timestamp and intervening whitespace.
The pattern or regex to match the time stamp is currently not documented, and not available for users to read or set. See Debian bug #491253. This is a problem if your log has a timestamp format that fail2ban doesn't expect, since it will then fail to match any lines. Because of this, you should test any new failregex against a sample log line, as in the examples below, to be sure that it will match. If fail2ban doesn't recognize your log timestamp, then you have two options: either reconfigure your daemon to log with a timestamp in a more common format, such as in the example log line above; or file a bug report asking to have your timestamp format included.
As for the reordering, something like datetime - status - host (- other stuff) allows a simple pattern such as 403 <HOST>.
Your log would then look like:
03-01-2017 15:53:01 403 1.2.3.4 "GET /aaa.jsp HTTP/1.1"
and your pattern can be
403 <HOST>
You can run this from the command line to validate as:
fail2ban-regex '03-01-2017 15:53:01 403 1.2.3.4 "GET /aaa.jsp HTTP/1.1"' '403 <HOST>'
Which produces the output:
Running tests
=============
Use regex line : 403 <HOST>
Use single line: 03-01-2017 15:53:01 403 1.2.3.4 "GET /aaa.jsp HTTP...
Matched time template Day-Month-Year Hour:Minute:Second
Got time using template Day-Month-Year Hour:Minute:Second
Results
=======
Failregex: 1 total
|- #) [# of hits] regular expression
| 1) [1] 403 <HOST>
`-
Ignoreregex: 0 total
Summary
=======
Addresses found:
[1]
1.2.3.4 (Tue Jan 03 15:53:01 2017)
Date template hits:
2 hit(s): Day-Month-Year Hour:Minute:Second
Success, the total number of match is 1
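The two-part matching described in the quoted manual text can be modelled in a few lines of Python. This is a rough sketch and not fail2ban's implementation: the two timestamp expressions, the simplified IPv4 stand-in for <HOST> and the helper match_failure are assumptions, and whether the real date template also consumes the +0100 zone offset may depend on the fail2ban version.

```python
import re

HOST = r'(?P<host>\d{1,3}(?:\.\d{1,3}){3})'   # simplified stand-in for <HOST>

# Assumption: two hand-written timestamp patterns standing in for the built-in templates.
TIMESTAMPS = [
    re.compile(r'^\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}(?: [+-]\d{4})?'),
    re.compile(r'^\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}'),
]


def match_failure(line, failregex):
    """Strip a recognised timestamp, then apply failregex to the remainder."""
    for timestamp in TIMESTAMPS:
        found = timestamp.match(line)
        if found:
            remainder = line[found.end():].lstrip()
            break
    else:
        return None   # no timestamp recognised, so the line cannot match
    failure = re.search(failregex.replace('<HOST>', HOST), remainder)
    return failure.group('host') if failure else None


ORIGINAL = '03/Jan/2017:15:53:01 +0100 - 1.2.3.4 "GET /aaa.jsp HTTP/1.1" 403'
REORDERED = '03-01-2017 15:53:01 403 1.2.3.4 "GET /aaa.jsp HTTP/1.1"'

# The remainder of the original line starts with "- ", so the anchored ^<HOST> fails.
print(match_failure(ORIGINAL, r'^<HOST> .* "(GET|POST) [^"]+" 403'))
# The reordered line leaves '403 1.2.3.4 ...', which the short pattern matches.
print(match_failure(REORDERED, r'403 <HOST>'))
```

In this model the original filter fails because the text left after the timestamp does not begin with the address, while the reordered log line matches the short pattern directly.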