Grok pattern does not match a log line when an optional group is not the last group

Example:
info: 2014-10-28T22:39:46.593Z - info: an error occurred while trying
to handle command: PlaceMarketOrderCommand, xkkdAAGRIl. Error:
Insufficient Cash #userId=5 #orderId=Y5545
pattern:
> %{LOGLEVEL:stream_level}: %{TIMESTAMP_ISO8601:timestamp} -
> %{LOGLEVEL:log_level}: %{MESSAGE:message}
> (#userId=%{USER_ID:user_id})? (#orderId=%{ORDER_ID:order_id})?
extra patterns used:
USER_ID (\d+|None)
ORDER_ID .*
ORDER_ID_HASH \s*(#orderId=%{ORDER_ID:order_id})?
USER_ID_HASH \s*(#userId=%{USER_ID:user_id})?
MESSAGE (.*?)
This works fine. Removing the optional last orderId also works:
info: 2014-10-28T22:39:46.593Z - info: an error occurred while trying
to handle command: PlaceMarketOrderCommand, xkkdAAGRIl. Error:
Insufficient Cash #userId=5
But if I keep the orderId and remove the userId, I get a "no match":
info: 2014-10-28T22:39:46.593Z - info: an error occurred while trying
to handle command: PlaceMarketOrderCommand, xkkdAAGRIl. Error:
Insufficient Cash #orderId=Y5545
Note that the user_id group also ends with a ?, so it should be optional too.
I am testing with the grok debugger on Heroku.
Is this a bug (Logstash 1.4.2), or am I missing something in the regex (more probable, but what)?
I looked at the regex library grok uses, and this syntax looks like it should work. It does work for the last group (orderId) but not for the one before it.
Thanks for the help!

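Your pattern requires a literal space before each optional group, even when the group itself is absent. Make those spaces optional as well by writing " ?" in front of each group: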
%{LOGLEVEL:stream_level}: %{TIMESTAMP_ISO8601:timestamp} - %{LOGLEVEL:log_level}: %{MESSAGE:message} ?(#userId=%{USER_ID:user_id})? ?(#orderId=%{ORDER_ID:order_id})?
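
The effect of the mandatory spaces can be reproduced outside Logstash. The following is a minimal Python sketch, not grok itself: the named groups are simplified stand-ins for the custom patterns above, the anchors and the shortened sample lines are assumptions made for the illustration, and Python's re engine is not the regex library grok uses, so details may differ.

```python
import re

# Simplified stand-ins for the custom grok patterns (assumptions, not grok syntax).
USER = r"(?:#userId=(?P<user_id>\d+|None))?"
ORDER = r"(?:#orderId=(?P<order_id>.*))?"

# Original layout: a literal space is required before each optional group.
strict = re.compile(r"^(?P<message>.*?) " + USER + r" " + ORDER + r"$")
# Suggested layout: the spaces are optional as well.
relaxed = re.compile(r"^(?P<message>.*?) ?" + USER + r" ?" + ORDER + r"$")

# Shortened sample lines (hypothetical, based on the log in the question).
both = "Error: Insufficient Cash #userId=5 #orderId=Y5545"
order_only = "Error: Insufficient Cash #orderId=Y5545"

for name, pattern in (("strict", strict), ("relaxed", relaxed)):
    for line in (both, order_only):
        match = pattern.match(line)
        print(name, "|", line, "|", match.groupdict() if match else "no match")
```

With the strict layout, the line without #userId would need two consecutive spaces before #orderId to match, which is why it fails; the relaxed layout accepts both lines.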

Multiple regex matching in filebeat for message field

I want to apply two regular expressions in Filebeat to drop events whose message field matches either of them.
I can make it work for a single regex condition, but I am not sure how to configure multiple regex conditions.
regex list:
message: "(?i)cron"
message: "^now ([0-9]{4})-([0-1][0-9])-([0-3][0-9])\s([0-1][0-9]|[2][0-3]):([0-5][0-9]):([0-5][0-9])$"
The following is the config I use for a single regex; it matches the text "cron", case-insensitively, anywhere in the message:
- drop_event:
    when:
      regexp:
        message: "(?i)cron"
Referring to the Filebeat docs, I tried several configs, but with each of them Filebeat won't start:
Try 1:
- drop_event:
    or:
      - regexp:
          message: "(?i)cron"
      - regexp:
          message: "^now ([0-9]{4})-([0-1][0-9])-([0-3][0-9])\s([0-1][0-9]|[2][0-3]):([0-5][0-9]):([0-5][0-9])$"
Try 2:
- if:
    regexp:
      message: "(?i)cron"
  then:
    drop_event:
- if:
    regexp:
      message: "^now ([0-9]{4})-([0-1][0-9])-([0-3][0-9])\s([0-1][0-9]|[2][0-3]):([0-5][0-9]):([0-5][0-9])$"
  then:
    drop_event:
I figured out how to apply multiple filters with the or operator in Filebeat. I was close in the second attempt in the post. The when key is required; under it we can use whichever operator we like (or, and, etc.).
Here is an example of how I am using it:
processors:
  - drop_event.when:
      or:
        - contains:
            container.name: "nginx"
        - contains:
            container.name: "mongo"
        - contains:
            container.name: "mysql"
        - contains:
            container.name: "redis"
        - equals:
            container.name: "tecnativa/tcp-proxy"
  - drop_event.when:
      or:
        - regexp:
            message: "(?i)cron"
        - regexp:
            message: "In On Child added message"
        - regexp:
            message: "In on Child removed message"
        - regexp:
            message: "then Moment"
        - regexp:
            message: "call_duration"
        - regexp:
            message: "now Moment"
        - regexp:
            message: "CHAT NOTIFICATION CODE"
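
As a rough illustration of what the or condition does with the two expressions from the question, here is a small Python sketch. It only mimics the logic: should_drop and the sample messages are hypothetical, and Filebeat evaluates its regexp conditions with Go's regular expression engine rather than Python's re, so edge cases may differ.

```python
import re

# The two expressions from the question, compiled with Python's re (an assumption:
# Filebeat itself uses Go's regexp package).
DROP_PATTERNS = [
    re.compile(r"(?i)cron"),
    re.compile(
        r"^now ([0-9]{4})-([0-1][0-9])-([0-3][0-9])\s"
        r"([0-1][0-9]|[2][0-3]):([0-5][0-9]):([0-5][0-9])$"
    ),
]


def should_drop(message: str) -> bool:
    """Mimic an 'or' condition: drop when at least one pattern matches."""
    return any(pattern.search(message) for pattern in DROP_PATTERNS)


# Hypothetical sample messages.
for sample in ("CRON[123]: session opened", "now 2020-10-08 19:59:46", "user logged in"):
    print(repr(sample), "->", "drop" if should_drop(sample) else "keep")
```

Trying candidate expressions against a handful of real log lines this way can catch a pattern that never matches before the config is deployed.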

Create a cakephp filter for fail2ban

I would like to create a fail2ban filter that finds and blocks bad requests such as "Controller class * could not be found."
For this I created a cakephp.conf file in fail2ban's filter.d directory with the following content:
[Definition]
failregex = ^[0-9]{4}\-[0-9]{2}\-[0-9]{2}.*Error:.*\nStack Trace:\n(\-.*|\n)*\n.*\n.*\nClient IP: <HOST>\n$
ignoreregex =
My example error log looks like this:
...
2020-10-08 19:59:46 Error: [Cake\Http\Exception\MissingControllerException] Controller class Webfig could not be found. in /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Controller/ControllerFactory.php on line 158
Stack Trace:
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Controller/ControllerFactory.php:46
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/BaseApplication.php:249
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:77
- /home/myapplication/htdocs/vendor/cakephp/authentication/src/Middleware/AuthenticationMiddleware.php:122
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:77
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Middleware/CsrfProtectionMiddleware.php:146
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:58
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Routing/Middleware/RoutingMiddleware.php:172
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Routing/Middleware/AssetMiddleware.php:68
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Error/Middleware/ErrorHandlerMiddleware.php:121
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:73
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Runner.php:58
- /home/myapplication/htdocs/vendor/cakephp/cakephp/src/Http/Server.php:90
- /home/myapplication/htdocs/webroot/index.php:40
Request URL: /webfig/
Referer URL: http://X.X.X.X/webfig/
Client IP: X.X.X.X
...
(The real IP addresses have been replaced with X.X.X.X.)
But I can't match any IP addresses. The fail2ban tester says:
root@test:~# fail2ban-regex /home/myapplication/htdocs/logs/error.log /etc/fail2ban/filter.d/cakephp.conf
Running tests
=============
Use failregex filter file : cakephp, basedir: /etc/fail2ban
Use log file : /home/myapplication/htdocs/logs/error.log
Use encoding : UTF-8
Results
=======
Failregex: 0 total
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [719] {^LN-BEG}ExYear(?P<_sep>[-/.])Month(?P=_sep)Day(?:T| ?)24hour:Minute:Second(?:[.,]Microseconds)?(?:\s*Zone offset)?
`-
Lines: 15447 lines, 0 ignored, 0 matched, 15447 missed
[processed in 10.02 sec]
Missed line(s): too many to print. Use --print-all-missed to print all 15447 lines
I can't see the problem. Can you help me? :)
Thanks
The issue is that your log is poorly suited to parsing: it is a multi-line log file, so the IP appears on a different line from the failure message.
Worse, the line with the IP carries no ID (nothing it shares with the failure line), so things can go wrong when several messages interleave: the Client IP of another message that is not a failure may follow the failure message.
If you can change the log format, do that (so that date, IP and failure sign are on the same line); for example, if you use nginx, set up conditional logging to the access log from the PHP location for the error case.
See Fail2ban :: wiki :: Best practice for more info.
If you cannot do that (though changing the format would be better), you can use multi-line buffering and parsing with the maxlines parameter and the <SKIPLINES> regex tag.
Your filter would be something like this:
[Definition]
# we ignore stack trace, so don't need to hold buffer window too large,
# 5 would be enough, but to be sure (if some log-messages crossing):
maxlines = 10
ignoreregex = ^(?:Stack |- /)
failregex = ^\s+Error: \[[^\]]+\] Controller class \S+ could not be found\..*<SKIPLINES>^((?:Request|Referer) URL:.*<SKIPLINES>)*^Client IP: <HOST>
To test it directly use:
fail2ban-regex --maxlines=5 /path/to/log '^\s+Error: \[[^\]]+\] Controller class \S+ could not be found\..*<SKIPLINES>^((?:Request|Referer) URL:.*<SKIPLINES>)*^Client IP: <HOST>' '^(?:Stack |- /)'
But as already said, this is really ugly; it would be better to find a way to log everything on a single line.
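
To see why a failregex containing \n is likely to match nothing by default, it may help to mimic the difference between testing each line on its own and testing a buffer of several lines. The Python sketch below is only an analogy under assumptions: the shortened log excerpt, the documentation address 192.0.2.10 and the simplified pattern are made up for illustration, and fail2ban's own <SKIPLINES> and <HOST> tags are not used.

```python
import re

# Shortened, made-up excerpt in the shape of the CakePHP error log.
log = (
    "2020-10-08 19:59:46 Error: [MissingControllerException] "
    "Controller class Webfig could not be found.\n"
    "Stack Trace:\n"
    "- /path/to/ControllerFactory.php:46\n"
    "Request URL: /webfig/\n"
    "Client IP: 192.0.2.10\n"
)

# Simplified multi-line pattern: failure line, any lines in between, client IP line.
pattern = re.compile(
    r"Error: .*could not be found\..*\n(?:.*\n)*?Client IP: (?P<host>\S+)"
)

# One line at a time: no single line contains both parts, so nothing matches.
per_line = [m.group("host") for m in map(pattern.search, log.splitlines()) if m]

# Several lines buffered together: the pattern can span the line breaks.
buffered = pattern.search(log)

print(per_line)
print(buffered.group("host"))
```

The first print shows an empty list and the second the client address, which mirrors why a buffer window (maxlines) is needed before a multi-line expression can match at all.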

JAVAMETHOD grok pattern with optional thread number at the end

I'm trying to parse log4j messages:
2019-12-02 20:48:20.198utc DEBUG UnknownElementContentHandler,streamLock-9-th-11:32 - blabla
2019-11-19 23:40:04.014utc WARN AnnotationBinder,localhost-startStop-1:611 - blabla
2019-11-19 23:40:04.014utc INFO CovImCtl,main:109 - blabla
with grok pattern
%{TIMESTAMP_ISO8601:timestamp}utc%{SPACE}%{LOGLEVEL:level}%{SPACE}%{JAVACLASS:class},%{JAVAMETHOD1:method}:%{POSINT:lineno}%{SPACE}-%{SPACE}%{GREEDYDATA:message}
using a variation on the standard pattern:
JAVAMETHOD (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)
JAVAMETHOD1 (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_\-0-9]*)
JAVAMETHOD worked for "main" but not for the others (the pattern was missing the "-").
JAVAMETHOD1 works, but I also need to capture the optional trailing integer as a "thread_no" field (11 from streamLock-9-th-11, 1 from localhost-startStop-1).
I'm racking my brain: a method like streamLock-9-th-11 also contains an internal "-\d+" ("-9"), which belongs to the name "streamLock-9-th".
Any ideas?

Fail2Ban fails to ban Asterisk Errors

I have fail2ban 0.9.1 with Asterisk 11 on Fedora 21 using IPTables.
The IP addresses that attack my server are not getting written to IP Tables automatically (see below about them working when manually running banip). Do you see any errors that would be causing this?
I get messages in my /var/log/asterisk/messages log about miscreants trying erroneous extensions.
My Regex works because when I run
fail2ban-regex /var/log/asterisk/messages /etc/fail2ban/filter.d/asterisk.conf
I get
Lines: 2985 lines, 0 ignored, 597 matched, 2388 missed [processed in 0.66 sec]
This means that 597 lines matched the regular expression. Right? Is there a way to show what lines were matched? and what the variables were?
I can also do:
fail2ban-client set asterisk banip 107.150.44.222
and IPTables is properly updated and the IP is banned. (Yes, I know I used a real IP address -- and as far as I am concerned everyone is welcome to ban the ba$%*$#rd)
jail.local
[asterisk]
enabled=yes
filter=asterisk
protocol=all
logpath = /var/log/asterisk/messages
banaction=iptables-multiport
port = 5060,5061
action = %(banaction)s[name=%(__name__)s-tcp, port="%(port)s", protocol="tcp", chain="%(chain)s", actname=%(banaction)s-tcp]
%(banaction)s[name=%(__name__)s-udp, port="%(port)s", protocol="udp", chain="%(chain)s", actname=%(banaction)s-udp]
%(mta)s-whois[name=%(__name__)s, dest="%(destemail)s"]
maxretry = 3
bantime=432000
findtime =86400
I removed the reference to Asterisk in jail.conf to avoid conflicts
filter.d/asterisk.conf
[INCLUDES]
# Read common prefixes. If any customizations available -- read them from
# common.local
before = common.conf
[Definition]
# Option: failregex
# Notes.: regex to match the password failures messages in the logfile.
# Values: TEXT
#
log_prefix= \[\]\s*(?:NOTICE|SECURITY)%(__pid_re)s:?(?:\[\S+\d*\])? \S+:\d*
failregex = ^%(log_prefix)s Registration from '[^']*' failed for '<HOST>(:\d+)?' - Wrong password$
^%(log_prefix)s Registration from '[^']*' failed for '<HOST>(:\d+)?' - No matching peer found$
^%(log_prefix)s Registration from '[^']*' failed for '<HOST>(:\d+)?' - Username/auth name mismatch$
^%(log_prefix)s Registration from '[^']*' failed for '<HOST>(:\d+)?' - Device does not match ACL$
^%(log_prefix)s Registration from '[^']*' failed for '<HOST>(:\d+)?' - Peer is not supposed to register$
^%(log_prefix)s Registration from '[^']*' failed for '<HOST>(:\d+)?' - ACL error \(permit/deny\)$
^%(log_prefix)s Registration from '[^']*' failed for '<HOST>(:\d+)?' - Not a local domain$
^%(log_prefix)s Call from '[^']*' \(<HOST>:\d+\) to extension '\d+' rejected because extension not found in context 'default'\.$
^%(log_prefix)s Host <HOST> failed to authenticate as '[^']*'$
^%(log_prefix)s No registration for peer '[^']*' \(from <HOST>\)$
^%(log_prefix)s Host <HOST> failed MD5 authentication for '[^']*' \([^)]+\)$
^%(log_prefix)s Failed to authenticate (user|device) [^@]+@<HOST>\S*$
^%(log_prefix)s (?:handle_request_subscribe: )?Sending fake auth rejection for (device|user) \d*<sip:[^@]+@<HOST>>;tag=\w+\S*$
^%(log_prefix)s SecurityEvent="(FailedACL|InvalidAccountID|ChallengeResponseFailed|InvalidPassword)",EventTV="[\d-]+",Severity="[\w]+",Service="[\w]+",EventVersion="\d+",AccountID="\d+",SessionID="0x[\da-f]+",LocalAddress="IPV[46]/(UD|TC)P/[\da-fA-F:.]+/\d+",RemoteAddress="IPV[46]/(UD|TC)P/(<HOST>)/[0-9]{4}"(,Challenge="\w+",ReceivedChallenge="\w+")?(,ReceivedHash="[\da-f]+")?$
# Option: ignoreregex
# Notes.: regex to ignore. If this regex matches, the line is ignored.
# Values: TEXT
#
ignoreregex =
Your asterisk.conf and jail.local entry look fine, though I typically add the jail name after the banaction. For example: banaction=iptables-multiport[name=asterisk]
Restart the fail2ban service and check your fail2ban log for any errors. A common one that didn't get fixed until v0.9.2 is:
Error in FilterPyinotify callback: 'module' object has no attribute '_strptime_time'
To fix it, update fail2ban to v0.9.2 or edit the file: /usr/share/fail2ban/common/__init__.py
and add the following text to the end of the file:
from time import strptime
# strptime thread safety hack-around - http://bugs.python.org/issue7980
strptime("2012", "%Y")
> Is there a way to show what lines were matched? and what the variables were?
You'll want to use the -v option with fail2ban-regex. It won't give you matched variables, but will list each IP Address associated with the matched line. You can then examine details for that IP in your asterisk logs.
fail2ban-regex -v /var/log/asterisk/messages /etc/fail2ban/filter.d/asterisk.conf

"YYYYMMDD": Invalid identifier error while trying through SQOOP

Please help me with the error below. The query works fine when run directly in Oracle but fails when run through a Sqoop import.
version : Hadoop 0.20.2-cdh3u4 and Sqoop 1.3.0-cdh3u5
sqoop import $SQOOP_CONNECTION_STRING
--query 'SELECT st.reference,u.unit,st.reading,st.code,st.read_id,st.avg FROM reading st,tunit tu,unit u
WHERE st.reference=tu.reference and st.number IN ('218730','123456') and tu.unit_id = u.unit_id
and u.enrolled='Y' AND st.reading <= latest_off and st.reading >= To_Date('20120701','yyyymmdd')
and st.type_id is null and $CONDITIONS'
--split-by u.unit
--target-dir /sample/input
Error:
12/10/10 09:33:21 ERROR manager.SqlManager: Error executing statement:
java.sql.SQLSyntaxErrorException: ORA-00904: "YYYYMMDD": invalid identifier
followed by....
12/10/10 09:33:21 ERROR sqoop.Sqoop: Got exception running Sqoop:
java.lang.NullPointerException
Thanks & Regards,
Tamil
I believe the problem is actually on the Bash side (or whichever command-line interpreter you use). Your query contains, for example, the fragment u.enrolled='Y'. Note that you delimit character constants with single quotes, and you also wrap the entire query in single quotes: --query 'YOUR QUERY'. That results in something like --query '...u.enrolled='Y'...', which Bash strips down to ...u.enrolled=Y... before the string is passed to Sqoop. You can verify this by using echo to see exactly what Bash does with your string:
jarcec@jarcec-thinkpad ~ % echo '...u.enrolled='Y'...'
...u.enrolled=Y...
I would recommend either escaping every single quote inside your query (in Bash, by writing each one as '\'', since a backslash alone does not escape anything inside single quotes) or using double quotes for the entire query. Note that the latter option requires escaping $ characters with a backslash (\$).
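
The quote stripping can also be checked without a shell. As a small sketch, Python's shlex.split follows POSIX-style quoting rules closely enough to show the effect; the shortened query fragments are assumptions for illustration, and shlex is not Bash, so treat this as an approximation.

```python
import shlex

# Whole query in single quotes: the inner quotes end and restart the string,
# so they vanish and the date format is no longer a string literal.
single = shlex.split("--query 'x >= To_Date('20120701','yyyymmdd')'")
print(single[1])

# Whole query in double quotes: the inner single quotes survive.
double = shlex.split("--query \"x >= To_Date('20120701','yyyymmdd')\"")
print(double[1])
```

In the first case the argument arrives as To_Date(20120701,yyyymmdd), which is consistent with Oracle treating yyyymmdd as an identifier and reporting ORA-00904.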