Logstash pattern matching more than it should - regex

I have a log line (multiline match) that looks like the following:
3574874 14/Jul/2016 20:42:37 +0000 ERROR [http-bio-0.0.0.0-8443-exec-128] error_jsp _jspService > could not lock: [com.myCompany.myProject.bean.scheduling.Area#6306]; SQL [/* UPGRADE lock com.myCompany.myProject.bean.scheduling.Area */ select Area_id from Area_ID where Area_id =? and Area_Version =? for update]; nested exception is org.hibernate.exception.LockAcquisitionException: could not lock: [com.myCompany.myProject.bean.scheduling.Area#6306]
org.springframework.dao.CannotAcquireLockException: could not lock: [com.myCompany.myProject.bean.scheduling.MyClass#6306]; SQL [/* UPGRADE lock com.myCompany.myProject.bean.scheduling.MyClass */ select Area_ID from Area where Area =? and Area =? for update]; nested exception is org.hibernate.exception.LockAcquisitionException: could not lock: [com.myCompany.myProject.bean.scheduling.Area#6306]
at org.springframework.orm.hibernate3.SessionFactoryUtils.convertHibernateAccessException(SessionFactoryUtils.java:639)
at org.springframework.orm.hibernate3.HibernateExceptionTranslator.convertHibernateAccessException(HibernateExceptionTranslator.java:89)
at org.springframework.orm.hibernate3.HibernateExceptionTranslator.translateExceptionIfPossible(HibernateExceptionTranslator.java:68)
at org.springframework.dao.support.ChainedPersistenceExceptionTranslator.translateExceptionIfPossible(ChainedPersistenceExceptionTranslator.java:58)
at org.springframework.dao.support.DataAccessUtils.translateIfNecessary(DataAccessUtils.java:213)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:163)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:633)
at com.myCompany.myProject.dal.hibernate.Impl$$EnhancerBySpringCGLIB$$535be625.lock(<generated>)
at com.myCompany.myProject.scheduling.Area.AreaClose(MyClass.java:1265)
at com.myCompany.myProject.scheduling.Area.handleProviderDisconnect(MyClass.java:1190)
at com.myCompany.myProject.scheduling.Area$$FastClassBySpringCGLIB$$220c3d67.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:700)
My pattern file lays things out like this:
# AW Tomcat formatting
TOMCAT_DATE %{MONTHDAY}[./-]%{MONTH}[./-]%{YEAR}
TOMCAT_TS %{BASE10NUM}
#%{BASE10NUM}
TOMCAT_TIME %{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
TOMCAT_THREAD \[(.+?)\]
TOMCAT_CLASS [A-Za-z0-9]+
TOMCAT_CLASS_METHOD %{NOTSPACE}
TOMCAT_TIMESTAMP %{TOMCAT_DATE}[\s]+%{TOMCAT_TIME}
TOMCAT_MESSAGE .+
TOMCAT_IP %{IP}
TOMCAT_DATA %{NOTSPACE}
TOMCAT_LOG (?:%{TOMCAT_TS:log_ts})[\s]+(?:%{TOMCAT_TIMESTAMP:log_timestamp})[\s]+%{INT}[\s]+(?:%{LOGLEVEL:log_level})[\s]+(?:%{TOMCAT_THREAD:thread})[\s]+(?:%{TOMCAT_CLASS:class_name})[\s]$
When I look at this in logstash it seems like the code for thread is matching a lot more than just the thread... it's matching all the way to [com.myCompany.myProject.bean.scheduling.MyClass#6306];
When I take JUST the thread and the regex for space after and toss it on regexr or regex101 I can't recreate this using:
\[(.+?)\][\s]+
Does anyone have insight on why the regex is working everywhere but logstash? Also, this works on about 99.5% of my incoming tomcat logs as well....

You have a + in your log. You need to match it with \+ or make it optional with \+?:
TOMCAT_LOG (?:%{TOMCAT_TS:log_ts})\s+(?:%{TOMCAT_TIMESTAMP:log_timestamp})\s+\+?%{INT}\s+(?:%{LOGLEVEL:log_level})\s+(?:%{TOMCAT_THREAD:thread})\s+(?:%{TOMCAT_CLASS:class_name})\s*$
^^^

Related

Regex for "wp-admin" "wp-login" entries in syslog trying on drupal sites

I am looking for a fail2ban regex (or two) to find the wp-admin and wp-login attemps on drupal sites.
The regex should find "drupal:" and "page not found" and ("wp-admin" or "wp-login")
the problem for me are the "and" conditions
The logfile entries:
Apr 7 10:59:23 webserver drupal: https://www.anywebsite.com|1617785962|page not found|123.456.789.112|https://www.anywebsite.com/wp-admin/admin-ajax.php?action=revslider_show_image&img=../wp-config.php|https://anywebsite.com/wp-admin/admin-ajax.php?action=revslider_show_image&img=../wp-config.php|0||wp-admin/admin-ajax.php
Apr 7 06:53:47 webserver drupal: https://www.anywebsite.com|1617771227|page not found|123.456.789.112|https://www.anywebsite.com/wp/wp-login.php||0||wp/wp-login.php
Here you go:
failregex = ^\s*\S+ drupal: [^|]*\|\d+\|(?:page not found)\|<ADDR>
replace <ADDR> with <HOST> for fail2ban versions before v.0.10
WARNING Note that this assumes that first URI in your log-line (site? referrer?) after drupal: never contains a pipe-character (so an intruder is unable to add it to URI somehow to avoid ban). Otherwise it becomes complex (you must anchor it from both sides or write some conditional REs with lookaheads or lookbehinds).
Also note that if your side can make some 404 for legitimate users (because missing some references etc), you have to add to the RE some precise pattern excluding your missing pages to avoid false positives, e. g. something like this (with blacklisting expressions):
_block_uris = wp-admin|(?:wp/)wp-login
failregex = ^\s*\S+ drupal: [^|]*\|\d+\|(?:page not found)\|<ADDR>\|\w+://[^/]+/(?:%(_block_uris)s)
or (with white-listing expressions, here ignoring /my-page/ and my-site/ URIs):
_ignore_uris = my-page/|my-side/
failregex = ^\s*\S+ drupal: [^|]*\|\d+\|(?:page not found)\|<ADDR>\|\w+://[^/]+/(?!%(_ignore_uris)s)

Create a cakephp filter for fail2ban

i would like to create a filter in fail2ban for searching and blocking bad request like "Controller class * could not be found."
For this problem i was create a cakephp.conf file in the filter.d directory in fail2ban. The Content:
[Definition]
failregex = ^[0-9]{4}\-[0-9]{2}\-[0-9]{2}.*Error:.*\nStack Trace:\n(\-.*|\n)*\n.*\n.*\nClient IP: <HOST>\n$
ignoreregex =
My example error log looks like this:
...
2020-10-08 19:59:46 Error: [Cake\Http\Exception\MissingControllerException] Controller class Webfig could not be found. in /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Controller/ControllerFactory.php on line 158
Stack Trace:
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Controller/ControllerFactory.php:46
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/BaseApplication.php:249
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:77
- /home/myapplication/htdocs/vendor/cakephp/authentication/src/Middleware/AuthenticationMiddleware.php:122
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:77
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Middleware/CsrfProtectionMiddleware.php:146
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:58
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Routing/Middleware/RoutingMiddleware.php:172
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Routing/Middleware/AssetMiddleware.php:68
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Error/Middleware/ErrorHandlerMiddleware.php:121
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:58
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Server.php:90
- /home/myapplication/htdocs/webroot/index.php:40
Request URL: /webfig/
Referer URL: http://X.X.X.X/webfig/
Client IP: X.X.X.X
...
X.X.X.X are replaced
But i can't match any ip adresses. The fail2ban tester says:
root#test:~# fail2ban-regex /home/myapplication/htdocs/logs/error.log /etc/fail2ban/filter.d/cakephp.conf
Running tests
=============
Use failregex filter file : cakephp, basedir: /etc/fail2ban
Use log file : /home/myapplication/htdocs/logs/error.log
Use encoding : UTF-8
Results
=======
Failregex: 0 total
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [719] {^LN-BEG}ExYear(?P<_sep>[-/.])Month(?P=_sep)Day(?:T| ?)24hour:Minute:Second(?:[.,]Microseconds)?(?:\s*Zone offset)?
`-
Lines: 15447 lines, 0 ignored, 0 matched, 15447 missed
[processed in 10.02 sec]
Missed line(s): too many to print. Use --print-all-missed to print all 15447 lines
i can't see any problems. Can you help me? :)
Thanks
The issue is your log is poor suitable to parse - it is a multiline log-file (IP takes place in other line as the failure message).
Let alone the line with IP does not has any ID (common information with line of failure), it can be still worse if several messages are crossing (so Client IP from other message that is not a failure, coming after failure message).
If you can change the log-format better do that (so date, IP and failure sign are in the same line), e.g. if you use nginx, organize a conditional logging for access log from php-location in error case like this.
See Fail2ban :: wiki :: Best practice for more info.
If you cannot do that (well better would be to change it), you can use multi-line buffering and parsing using maxlines parameter and <SKIPLINES> regex.
Your filter would be something like that:
[Definition]
# we ignore stack trace, so don't need to hold buffer window too large,
# 5 would be enough, but to be sure (if some log-messages crossing):
maxlines = 10
ignoreregex = ^(?:Stack |- /)
failregex = ^\s+Error: \[[^\]]+\] Controller class \S+ could not be found\..*<SKIPLINES>^((?:Request|Referer) URL:.*<SKIPLINES>)*^Client IP: <HOST>
To test it directly use:
fail2ban-regex --maxlines=5 /path/to/log '^\s+Error: \[[^\]]+\] Controller class \S+ could not be found\..*<SKIPLINES>^((?:Request|Referer) URL:.*<SKIPLINES>)*^Client IP: <HOST>' '^(?:Stack |- /)'
But as already said, it is really ugly - better you find the way to log everything in a single line.

JAVAMETHOD grok pattern with optional thread number at the end

I'm trying to parse log4j messages:
2019-12-02 20:48:20.198utc DEBUG UnknownElementContentHandler,streamLock-9-th-11:32 - blabla
2019-11-19 23:40:04.014utc WARN AnnotationBinder,localhost-startStop-1:611 - blabla
2019-11-19 23:40:04.014utc INFO CovImCtl,main:109 - blabla
with grok pattern
%{TIMESTAMP_ISO8601:timestamp}utc%{SPACE}%{LOGLEVEL:level}%{SPACE}%{JAVACLASS:class},%{JAVAMETHOD1:method}:%{POSINT:lineno}%{SPACE}-%{SPACE}%{GREEDYDATA:message}
with using a variation on the standard:
JAVAMETHOD (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)
JAVAMETHOD1 (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_\-0-9]*)
The JAVAMETHOD worked for "main" but not for the others, (the pattern was missing -).
JAVAMETHOD1 works, but I need to get the optional trailing integer retrieved as a "thread_no" field (11 from streamLock-9-th-11, 1 from localhost-startStop-1)
I'm wrecking my brain, the methods like streamLock-9-th-11 has the internal "-\d+" "-9" which belongs to "streamLock-9-th"
Any ideas?

expect regex to search a complete line or text

my expect_output(buffer) is as below in multiple lines.
status QM1
QM(QM1) Status(Running)
CPU: 0.02%
Memory: 106MB
Queue manager file system: 391MB used, 47.3GB allocated
[1%]
HA role: Primary
HA status: Normal
HA control: Enabled
HA preferred location: Here
mqa(mqcli)#
tried multiple regex options, but i keep getting 0, as no match.
regexp /^.*\b(Running)\b.*$ $qmgrstat similar, get the value of "HA Status".
What would be the correct Regex syntax.
Not sure I understand. Do you only want to match the value of HA status? If yes, then this regex would do that:
HA status:\W*(.*)
If you want to capture the HA status when the "QM" status is Running, you can do
if {[string match {*Status(Running)*} $qmgrstat]} {
regexp {HA status:\s+(\S+)} $qmgrstat -> status
}
You can use the following regex to get the whole output:
status[^\v]*\vQM[^\v]*\vCPU:[^\v]*\vMemory:[^\v]*\vQueue manager file system:[^\v]*\vHA role:[^\v]*\vHA status:[^\v]*\vHA control:[^\v]*\vHA preferred location:[^\v]*\vmqa[^\v]*#\v
if you want the complete line of text with the status use: [^\v]*Status\([^\v]*
To get the line with the HA status you can use: HA status:[^\v]*
Good luck!

grok regex parsing not matching a log. when specifying a group as optional, but not the last group

Example:
info: 2014-10-28T22:39:46.593Z - info: an error occurred while trying
to handle command: PlaceMarketOrderCommand, xkkdAAGRIl. Error:
Insufficient Cash #userId=5 #orderId=Y5545
pattern:
> %{LOGLEVEL:stream_level}: %{TIMESTAMP_ISO8601:timestamp} -
> %{LOGLEVEL:log_level}: %{MESSAGE:message}
> (#userId=%{USER_ID:user_id})? (#orderId=%{ORDER_ID:order_id})?
extra patterns used:
USER_ID (\d+|None)
ORDER_ID .*
ORDER_ID_HASH \s*(#orderId=%{ORDER_ID:order_id})?
USER_ID_HASH \s*(#userId=%{USER_ID:user_id})?
MESSAGE (.*?)
Works fine:
removing the optional last orderId also works
info: 2014-10-28T22:39:46.593Z - info: an error occurred while trying
to handle command: PlaceMarketOrderCommand, xkkdAAGRIl. Error:
Insufficient Cash #userId=5
but if I keep the orderId and remove the userId then I get a "no match"
info: 2014-10-28T22:39:46.593Z - info: an error occurred while trying
to handle command: PlaceMarketOrderCommand, xkkdAAGRIl. Error:
Insufficient Cash #orderId=Y5545
Also the user_id group is ending with a ? as an optional group..
working with the grok debugger in heroku:
Is this a bug? (logstash 1.4.2) missing something with the regex? (more probable.. but what?)
I looked at the regex lib grok is using and looks this syntax supposed to work. It does work for the last group (orderId) but not for the one before..
Thanks for the help!
You are forcing a space to be before your optional last... you need to do ?:
%{LOGLEVEL:stream_level}: %{TIMESTAMP_ISO8601:timestamp} -> %{LOGLEVEL:log_level}: %{MESSAGE:message} ?(#userId=%{USER_ID:user_id})? ?(#orderId=%{ORDER_ID:order_id})?