ModSecurity to only audit log critical rule events - mod-security

I'm trying to get ModSecurity 2.7.4 to write an audit log entry only when a critical rule match has been found.
I am using anomaly scoring, so if only one warning is matched, the request will be allowed and I do not want the event logged. If two warnings are matched, pushing the score too high, the inbound_anomaly_score rule fires and denies the request; I would like that event logged, together with the warnings that contributed to triggering it.
I only need this across phases 1 and 2.
I'm currently using the default value for SecDefaultAction.
These are the current settings for the audit log; despite SecAuditLogRelevantStatus, responses with status code 200 are still being logged.
SecAuditEngine RelevantOnly
SecAuditLogRelevantStatus "^(?:5|4\d[^4])"
SecAuditLogType Serial
Any help would be greatly appreciated.
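For reference, one possible direction — a hedged sketch, not verified against the exact CRS rule set for 2.7.4: with SecAuditEngine RelevantOnly, ModSecurity also audit logs any transaction in which a matched rule had the auditlog action active, independent of SecAuditLogRelevantStatus. The stock SecDefaultAction includes auditlog, which would explain the 200 responses showing up. Switching the defaults to noauditlog and re-enabling auditlog only on the anomaly-threshold blocking rule should limit entries to denied requests:

SecDefaultAction "phase:1,pass,log,noauditlog"
SecDefaultAction "phase:2,pass,log,noauditlog"

# Illustrative blocking rule, not the literal CRS rule text:
SecRule TX:INBOUND_ANOMALY_SCORE "@ge %{tx.inbound_anomaly_score_level}" \
    "phase:2,t:none,log,auditlog,deny,msg:'Inbound Anomaly Score Exceeded'"

# Include part K (matched rules) so the audit entry for the denied
# request also lists the warnings that contributed to the score:
SecAuditLogParts ABIJDEFHKZ

Since only the blocking rule marks the transaction relevant, the warnings alone no longer trigger audit logging, but they still appear in parts H/K of the entry for the denied request.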

Related

AWS CloudWatch: Repeat alert notification after 24h in alarm state

I created some AWS CloudWatch alarms, typically with a period of 1 hour / 1 datapoint. When an alarm fires, our service team is notified. During a "normal" workday someone takes care of it and does the work of resetting some programs etc. But it also happens that no one has the time or inclination to deal with it, and the alarm stays in the ALARM state.
Now I want the notification to repeat if there hasn't been any state change in the past 24 hours. Is that possible? I haven't found an "easy" answer yet.
Thx!
EDIT:
Added a "daily_occurrence_alert" which is controlled by event rules / time control. An additional alarm for each observed alarm, combined with an AND, works well; see the sketch below.
It is a workaround, not a solution. I hope this feature will be added as standard in the future.
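For anyone wiring up the same workaround, a minimal sketch of the "original alarm AND daily tick" combination as a CloudWatch composite alarm via boto3; the alarm names and the SNS topic ARN are placeholders:

import boto3

cloudwatch = boto3.client("cloudwatch")

# "service-alarm" is the original alarm; "daily-tick-alarm" is an extra
# alarm you drive into ALARM once a day (the time control part).
cloudwatch.put_composite_alarm(
    AlarmName="service-alarm-daily-repeat",
    # Fires only when BOTH are in ALARM, i.e. the original alarm is
    # still unresolved when the daily tick comes around.
    AlarmRule='ALARM("service-alarm") AND ALARM("daily-tick-alarm")',
    AlarmActions=["arn:aws:sns:eu-central-1:123456789012:service-team"],
    ActionsEnabled=True,
)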

GCP Alert Filters Don't Affect Open Incidents

I have an alert that I have configured to send email when the sum of executions of cloud functions that have finished in status other than 'error' or 'ok' is above 0 (grouped by the function name).
The way I defined the alert is:
And the secondary aggregator is delta.
The problem is that once the alert is open, it looks like the filters don't matter any more, and the alert stays open because it sees that the cloud function is triggered and finishes with any status (even an 'ok' status keeps it open, as long as it's triggered often enough).
ATM the only solution I can think of is to define a log-based metric that does the counting itself, so that the alert can be based on that custom metric instead of on the built-in one.
Is there something that I'm missing?
Edit:
Adding another image to show what I think might be the problem:
From the image above we see that the graph won't go down to 0 but stays at 1, which is not how other, normal incidents behave.
According to the official documentation:
"Monitoring automatically closes an incident when it observes that the condition is no longer met or when 7 days have passed without an observation that the condition is still being met."
That made me think that there are cases where the condition is never observed as "no longer met", so the incident doesn't close. This is confirmed here:
"If measurements are missing (for example, if there are no HTTP requests for a couple of minutes), the policy uses the last recorded value to evaluate conditions."
The lack of HTTP requests isn't a reason to close the incident, since the policy keeps using the last recorded value (the one that triggered it).
So using alerts on HTTP requests is fine, but you need to close the incidents yourself. I think it would be better to use a custom metric instead if you want them to close automatically.
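A minimal sketch of that custom-metric route with the google-cloud-logging client; the metric name and the log filter are assumptions that you would need to adapt to your actual Cloud Functions log entries:

from google.cloud import logging

client = logging.Client()

# Hypothetical filter: count executions whose reported status is
# neither 'ok' nor 'error'. Adjust to your real log fields.
FILTER = (
    'resource.type="cloud_function" '
    'AND labels.execution_status!="ok" '
    'AND labels.execution_status!="error"'
)

metric = client.metric(
    "function_abnormal_finish",
    filter_=FILTER,
    description="Cloud Function executions finishing in an abnormal status",
)
if not metric.exists():
    metric.create()

An alerting policy on this counter drops back to 0 when nothing matches, so its incidents can close automatically.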

How can I filter out errors on Sentry to avoid consuming my quota?

I'm using Sentry to log my errors, but there are errors I'm not able to fix (or that can't be fixed by me), like
OSError (write error)
Or errors that come from RQ (each time I deploy my app)
Or client errors (which are client.errors)
I can't just ignore them, because they consume all my quota. How can I filter out these errors?
Here are some references for interested people.
uwsgi: OSError: write error during GET request
Fixing broken pipe error in uWSGI with Python
https://github.com/unbit/uwsgi/issues/1623
I created a Gist for rate-limiting the number of events that are sent to Sentry:
https://gist.github.com/jurrian/e22f8e724b8499a29c5537e956f0dc7f
It uses ratelimitingfilter, which can be configured with a per-minute rate and, additionally, a burst so that rate limiting only kicks in after a number of events.
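For reference, the basic shape of that approach; a sketch where the StreamHandler is just a stand-in for whatever handler forwards records to Sentry in the Gist:

import logging
from ratelimitingfilter import RateLimitingFilter  # pip install ratelimitingfilter

# At most 1 record per 30 seconds, letting the first 5 through unthrottled.
throttle = RateLimitingFilter(rate=1, per=30, burst=5)

handler = logging.StreamHandler()  # stand-in for the Sentry-bound handler
handler.addFilter(throttle)

logger = logging.getLogger("myapp")
logger.addHandler(handler)

for _ in range(100):
    logger.error("uwsgi: OSError: write error")  # only a handful get through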
I get the same errors, but I never had any problems with my quota. But if you really want to filter them, you can just do it in your SDK:
https://docs.sentry.io/error-reporting/configuration/filtering/?platform=python
But beware, this could hide other errors, as mentioned here:
https://github.com/pypa/warehouse/issues/679
To save yourself some quota, you have two options:
Avoid forwarding events client-side, thus preventing events from being sent to Sentry at all. Have a look at the docs for available client-side filters; a sketch follows after this list. The drawback with this approach is of course that you need a new code deployment for any adjustment of client-side filters, and some clients may not instantly reflect your code changes.
Avoid forwarding events on Sentry's side, via inbound filters ([Project] > Project Settings > Inbound Filters). According to the Sentry documentation on quota usage, events filtered via inbound filters do not count against your quota.
Inbound filters include:
Common browser extension errors
Events coming from localhost
Known legacy browser errors
Known web crawlers
By their error message
From specific release versions of your code
From certain IP addresses
Business plans and above also allow you to filter events by error message.
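To make the client-side option concrete, a minimal sketch using the Python SDK's before_send hook from the filtering docs linked above; treating the noise as OSError with a "write error" message is an assumption based on the errors quoted in the question:

import sentry_sdk

def before_send(event, hint):
    # Drop events for exceptions we have decided not to track.
    if "exc_info" in hint:
        exc_type, exc_value, tb = hint["exc_info"]
        # Assumption: the uwsgi noise surfaces as OSError (write error).
        if isinstance(exc_value, OSError) and "write error" in str(exc_value):
            return None  # returning None discards the event client-side
    return event

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    before_send=before_send,
)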

AWS Lambda execution fails only the first time I run it with 'customer function error'

I trigger a lambda function via API gateway and everything works perfectly with the one exception that the very first time I trigger it on a given day it fails.
Strangely, the Lambda function logs don't show any errors. I get my usual START log statement, then the request and context of the trigger, and then, after 5s, it ends unexpectedly.
When I look into the API gateway logs this is the error it returns:
Lambda execution failed with status 200 due to customer function error: 2018-12-10T11:00:31.208Z cc233168-fc9n-11fc-a05a-577bb4sd2b2ccc Task timed out after 5.01 seconds.
Has anyone encountered a similar problem? What is a 'customer function error', and how can I resolve it?
Without knowing much about the code you are using, I would call this a cold start. A cold start happens on the first request after your function has not been called for a long time. Notice the error message says "Task timed out after 5.01 seconds", which matches the configured 5-second timeout; you can increase that timeout.
Alternatively, you could reduce the impact of cold starts by reducing their length. For reference:
author your Lambda functions in a language that doesn't incur a high cold start time, i.e. Node.js, Python, or Go
choose a higher memory setting for functions on the critical path of handling user requests (i.e. anything the user has to wait on for a response, including intermediate APIs)
optimize your function's dependencies and package size
You can also explore keeping the function warm by putting a cron job through CloudWatch Events that pings your API at a specific interval.
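If raising the timeout is the immediate fix, it is a one-call change; a sketch with a hypothetical function name, and 30 seconds as an arbitrary new value:

import boto3

lambda_client = boto3.client("lambda")

# Raise the function timeout from 5s to 30s so cold starts have headroom.
lambda_client.update_function_configuration(
    FunctionName="my-api-function",  # hypothetical name
    Timeout=30,
)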
Adding to Yash's answer:
I've only seen Lambda execution failed with status 200 in API Gateway execution logs, though it may manifest in other ways. Ensure you have execution logging enabled for the endpoint; if you didn't already have it enabled, you'll need to wait for the problem to manifest again.
You can verify it's a cold start problem as follows:
In the log entry with the error, grab the #logStream value and the timestamp for the event; it'll be a long string of alphanumerics like a4f8115980dc83a511eeedc493a78741
Open the log group for that endpoint's execution log -> find the log stream with the identifier you just grabbed
Narrow the date/time range to a window around the time where the event occurred
If you chose a narrow window and it is a cold start problem, I would expect the offending request to be the first one in the list. Click the "There are older events to load. Load more." link at the top of the list.
You should now see a gap of time between the last request received and the offending request.
In my case the error says connection reset by peer, which leads me to think it's behaving as though a virtual machine were put to sleep and then awoken, in the sense that it believes TCP connections it previously had open are still valid.
In the short term the solution we're going with is to implement a retry strategy.
Besides the cold-start problem, there's another potential aspect of this problem: your API Gateway access log format.
Do the following:
Find the access log entries that correspond to the offending request in the execution log.
Is the HTTP status == 502?
502s in the API Gateway access log usually (always?) indicate the Lambda responded with malformed JSON.
The most obvious reason for it returning malformed JSON is a bug in your code. One of the less obvious reasons: a mistake in the access log format.
If you suspect that's the case, look for the following:
Quoted fields that shouldn't be, e.g. $context.error.messageString
Un-quoted fields that should be. A common idiom is to leave numeric fields un-quoted because it makes Insights queries like this work: | filter #status >= 500. As convenient as that is, if the field isn't guaranteed to produce a numeric result, the resulting JSON will be malformed.
Trailing commas in {} bodies
Here's the documentation for many of the context variables, though one thing to keep in mind: the context variables that are available differ between the different API Gateway endpoint types (Lambda, WebSocket, etc.).
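To illustrate those pitfalls, a hypothetical access log format with the problem spots annotated (JSON has no comments; the arrows are commentary only):

{
  "requestId": "$context.requestId",
  "status": $context.status,                 <- un-quoted numeric field: fine, always present
  "message": "$context.error.messageString", <- wrong: messageString is already quoted, so this double-quotes it
  "latency": $context.integrationLatency     <- risky: malformed JSON whenever the variable resolves to empty
}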

Throttling while registering activities in Simple Workflow

We have started to experience failures when our processes start up, during the registration of activities. The problem is happening in GenericActivityWorker.registerActivityTypes.
The exception generated is:
Caused by: AmazonServiceException: Status Code: 400, AWS Service: AmazonSimpleWorkflow, AWS Request ID: 78726c24-47ee-11e3-8b49-534d57dc0b7f, AWS Error Code: ThrottlingException, AWS Error Message: Rate exceeded
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:686)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:350)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:202)
at com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient.invoke(AmazonSimpleWorkflowClient.java:3061)
at com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient.registerActivityType(AmazonSimpleWorkflowClient.java:2231)
at com.amazonaws.services.simpleworkflow.flow.worker.GenericActivityWorker.registerActivityType(GenericActivityWorker.java:153)
at com.amazonaws.services.simpleworkflow.flow.worker.GenericActivityWorker.registerActivityTypes(GenericActivityWorker.java:118)
at com.amazonaws.services.simpleworkflow.flow.worker.GenericActivityWorker.registerTypesToPoll(GenericActivityWorker.java:105)
at com.amazonaws.services.simpleworkflow.flow.worker.GenericWorker.start(GenericWorker.java:367)
at com.amazonaws.services.simpleworkflow.flow.ActivityWorker.start(ActivityWorker.java:248)
at com.fluid.retail.workflows.DefaultWorkflowHost.start(DefaultWorkflowHost.java:226)
... 5 more
The ActivityWorker in question has 5 activity implementation classes associated with it, and I think this throttling occurs because the internal Flow Framework code loops over the activity types and registers them without any delay in between.
Because this code is internal to the framework, we can't add any sleep() calls to prevent being throttled.
Any ideas would be appreciated.
Are you sure this is happening while registering your activities, or is it happening while scheduling them?
You would get this issue if you try to run a workflow that schedules too many activities too fast. At that point you have two options:
1. Try to make the activities sequential, each waiting for the previous one to complete.
2. Contact AWS to increase your account's rate limit.
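Since the registration loop is inside the framework, a third option is to retry worker startup with backoff when the throttling surfaces. A hedged Java sketch using only the types visible in the stack trace above; note that it re-runs registration from the top on each attempt:

static void startWithBackoff(ActivityWorker worker) throws InterruptedException {
    long delayMillis = 1000;
    for (int attempt = 1; ; attempt++) {
        try {
            worker.start();  // internally registers all activity types
            return;
        } catch (AmazonServiceException e) {
            // Retry only on SWF throttling, and give up after 5 attempts.
            if (!"ThrottlingException".equals(e.getErrorCode()) || attempt >= 5) {
                throw e;
            }
            Thread.sleep(delayMillis);
            delayMillis *= 2;  // exponential backoff
        }
    }
}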