How to match a pattern of "a=b c=d" with changing order in grok (logstash)?

I'm using Logstash to match Fortinet analyzer logs, and the problem is that there are many message patterns whose fields appear in no fixed order.
e.g. one type of message would be:
service=DNS hostname="a.b.net" profile="Dns" action=blocked reqtype=direct url="/" sentbyte=0 rcvdbyte=0 direction=N/A msg="URL belongs to a denied category in policy" method=domain cat=61 catdesc="Phishing" crscore=60 crlevel=high
...and another is:
msg="File is infected." action=blocked service=HTTP sessionid=33137 direction=incoming filename="favicon.ico" quarskip=No-skip virus="MSWord/Agent.DD60!tr" dtype="Virus" ref="http://www.fortinet.com/ve?vn=MSWord%2FAgent.DD60%21tr" virusid=6920465 profile="AV"
As you can see, both have msg, action, service and profile, but in a different order.
Is there any way to build a pattern to match something like:
(.*?)=%{DATA:\1?}\s
...while giving the field the name of the first match?

Use the kv{} filter, which splits the message into key=value pairs and doesn't care about their order.
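To illustrate why key=value splitting is order-independent, here is a minimal Python sketch of the idea. It is not Logstash's implementation; the `parse_kv` function and the `PAIR` expression are hypothetical names, and the quoting rules are an assumption based on the two sample messages in the question.

```python
import re

# A key, then '=', then either a double-quoted value or a run of non-space characters.
PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S*))')

def parse_kv(line):
    """Return a dict of the key=value pairs found in line, whatever their order."""
    fields = {}
    for m in PAIR.finditer(line):
        key = m.group(1)
        # Group 2 holds a quoted value, group 3 an unquoted one.
        value = m.group(2) if m.group(2) is not None else m.group(3)
        fields[key] = value
    return fields

first = parse_kv('service=DNS profile="Dns" action=blocked msg="URL belongs to a denied category in policy"')
second = parse_kv('msg="File is infected." action=blocked service=HTTP profile="AV"')

# The same field names are available from both messages despite the different order.
print(first["service"], first["action"], first["msg"])
print(second["service"], second["action"], second["msg"])
```

Because each pair is extracted on its own, no per-message-type grok pattern is needed.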

Related

Fail2Ban regex for Drupal log not matching

I am trying to match and ban certain patterns in my drupal logs (drupal 9).
I have taken the base drupal-auth regex, created a new conf and tried to amend it to my requirements, but I seem to be failing at the first hurdle. The expression below matches anything that has the type 'user'; the type is selected by the user\| in the expression, just before the <HOST> block:
failregex = ^%(__prefix_line)s(https?:\/\/)([\da-z\.-]+)\.([a-z\.]{2,6})(\/[\w\.-]+)*\|\d{10}\|user\|<HOST>\|.+\|.+\|\d\|.*\|.+\.$
If I want to search exactly the same pattern, but with say 'page not found' or 'access denied' instead of 'user' what do I need? I cannot seem to get it to match the moment the type has a space in it. It seems such a simple thing to do!
I am using fail2ban-regex --print-all-matched to test.

RegEx for filtering in Azure using Terraform

The Terraform azurerm_image data source lets you use a RegEx to identify a machine image whose ID matches the regular expression.
What RegEx should be used to retrieve an image that includes the string MyImageName and that takes the complete form /subscriptions/abc-123-def-456-ghi-789-jkl/resourceGroups/MyResourceGroupName/providers/Microsoft.Compute/images/MyImageName1618954096 ?
The following version of the RegEx is throwing an error because it will not accept two * characters. However, when we only used the trailing *, the image was not retrieved.
data "azurerm_image" "search" {
name_regex = "*MyImageName*"
resource_group_name = var.resourceGroupName
}
Note that the results only return a single image so you do not need to worry about multiple images being returned. There is a flag that can be set to specify either ascending or descending sorting to retrieve the oldest or the newest match.
The precise error we are getting is:
Error: "name_regex": error parsing regexp: missing argument to repetition operator: `*`
Nick's Suggestion
Per @Nick's suggestion, we tried:
data "azurerm_image" "search" {
name_regex = "/MyImageName[^/]+$"
resource_group_name = var.resourceGroupName
}
But the result is:
Error: No Images were found for Resource Group "MyResourceGroupName"
We checked in the Azure Portal and there is an image that includes MyImageName in its name within the resource group named MyResourceGroupName. We also confirmed that Terraform is running as the subscription owner, so we imagine that the subscription owner has sufficient authorization to filter image names.
What else can we try?

In my testing, name_regex works when it contains only a trailing *. With a leading *, it produces that error message.
For example, I have an image name rrr-image-20210421150018 in my resource group.
The following works:
r*
-*
8*
rrr*
image*
2021*
The following does not work:
*r
*-
*8
*image*
*rrr*
Also, verify if you have the latest azurerm provider.
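The pattern above is consistent with * being a regular-expression repetition operator rather than a shell-style wildcard: it repeats the preceding token, so a leading * has nothing to repeat. The sketch below shows this with Python's re module. The provider uses a different regex engine, so treat this as an illustration only; the `compiles` helper is hypothetical, and the last check assumes name_regex is applied to the image name rather than the full resource ID.

```python
import re

image_name = "MyImageName1618954096"

def compiles(pattern):
    """Return True if pattern is a syntactically valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles("*MyImageName*"))   # leading * has no token to repeat
print(compiles("MyImageName*"))    # valid: zero or more 'e' after 'MyImageNam'

# No wildcard is needed to test for a substring; search looks anywhere in the string.
print(re.search("MyImageName", image_name) is not None)

# A pattern such as 8* also matches names without any 8, because zero repetitions are allowed.
print(re.search("8*", "rrr-image") is not None)

# Assumption: if only the name is tested, a pattern requiring a leading '/' cannot match.
print(re.search("/MyImageName[^/]+$", image_name) is None)
```

Under these assumptions, a plain "MyImageName" or an anchored "^MyImageName" would be the regex equivalent of the intended wildcard search.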

URL not resolving correctly in Django URL

I just tried the following in my AJAX update:
[Server]/secTypes/Update
This maps to the following URL pattern in urls.py:
url(r'^secTypes/Update/', equity.views.updateSecTypes, name='updateSecTypes'),
This doesn't resolve to the following function in my view.
But when I change the URL expression to:
url(r'^su/', equity.views.updateSecTypes, name='updateSecTypes')
It works fine.
What in the URL resolver is not getting accurately mapped? Is it the forward slash?
I think it has to do with something related to the regex so if someone understands this better can help me that would be appreciated.

From the url patterns in your comments, it looks like you had another matching pattern before the one in your question.
There are two simple solutions for this.
Move that first pattern down. Change this:
url(r'^secTypes/', equity.views.getSecTypes, name='getSecTypes'),
url(r'^secTypesAll/', equity.views.getSecTypesAll, name='getSecTypesAll'),
url(r'^secTypes/Update/', equity.views.updateSecTypes, name='updateSecTypes'),
url(r'^secTypes/Delete/', equity.views.deleteSecTypes, name='deleteSecTypes'),
url(r'^secTypes/Create/', equity.views.createSecTypes, name='createSecTypes'),
to this:
url(r'^secTypesAll/', equity.views.getSecTypesAll, name='getSecTypesAll'),
url(r'^secTypes/Update/', equity.views.updateSecTypes, name='updateSecTypes'),
url(r'^secTypes/Delete/', equity.views.deleteSecTypes, name='deleteSecTypes'),
url(r'^secTypes/Create/', equity.views.createSecTypes, name='createSecTypes'),
url(r'^secTypes/', equity.views.getSecTypes, name='getSecTypes'),
The order matters when resolving URL patterns: if an earlier one matches, the following ones are not processed.
Both r'^secTypes/' and r'^secTypes/Update/' match the string 'secTypes/Update/', so you need to be careful to put the more specific one first and the more general one afterwards.
Update the regex to match the end of the URL string by adding a $ like this:
url(r'^secTypes/$', equity.views.getSecTypes, name='getSecTypes'),
url(r'^secTypesAll/$', equity.views.getSecTypesAll, name='getSecTypesAll'),
url(r'^secTypes/Update/$', equity.views.updateSecTypes, name='updateSecTypes'),
url(r'^secTypes/Delete/$', equity.views.deleteSecTypes, name='deleteSecTypes'),
url(r'^secTypes/Create/$', equity.views.createSecTypes, name='createSecTypes'),
This is the preferred solution since it stops Django from matching a URL like secTypes/Update/foobar.
However, if you have logic in the view that specifically uses the substring after the end of the URL pattern (i.e. foobar based on the above example), this wouldn't work.
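The first-match behaviour described above can be reproduced with plain regular expressions. This is a simplified stand-in for the URL resolver, not Django's actual code; the `resolve` helper and the use of view names as strings are assumptions made for illustration.

```python
import re

def resolve(path, patterns):
    """Return the view name of the first pattern that matches path, or None."""
    for regex, view_name in patterns:
        if re.search(regex, path):
            return view_name
    return None

# General pattern listed first: it shadows the more specific one.
general_first = [
    (r'^secTypes/', 'getSecTypes'),
    (r'^secTypes/Update/', 'updateSecTypes'),
]

# Same order, but each pattern is anchored at the end with $.
anchored = [
    (r'^secTypes/$', 'getSecTypes'),
    (r'^secTypes/Update/$', 'updateSecTypes'),
]

print(resolve('secTypes/Update/', general_first))   # getSecTypes
print(resolve('secTypes/Update/', anchored))        # updateSecTypes
print(resolve('secTypes/Update/foobar', anchored))  # None
```

With the $ anchors in place, the order of the patterns no longer affects which view is chosen.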

How to configure Fiddler's Autoresponder to "map" a host to a folder?

I'm already using Fiddler to intercept requests for specific remote files while I'm working on them (so I can tweak them locally without touching the published contents).
i.e. I use many rules like this
match: regex:(?insx).+/some_file([?a-z0-9-=&]+\.)*
respond: c:\somepath\some_file
This works perfectly.
What I'd like to do now is taking this a step further, with something like this
match: regex:http://some_dummy_domain/(anything)?(anything)
respond: c:\somepath\(anything)?(anything)
or, in plain text,
Intercept any http request to 'some_dummy_domain', go inside 'c:\somepath' and grab the file with the same path and name that was requested originally. Query string should pass through.
Some scenarios to further clarify:
http://some_domain/somefile --> c:\somepath\somefile
http://some_domain/path1/somefile --> c:\somepath\path1\somefile
http://some_domain/path1/somefile?querystring --> c:\somepath\path1\somefile?querystring
I tried to leverage what I already had:
match: regex:(?insx).+//some_dummy_domain/([?a-z0-9-=&]+\.)*
respond: ...
Basically, I'm looking for //some_dummy_domain/ in requests. This seems to match correctly when testing, but I'm missing how to respond.
Can Fiddler use matches in responses, and how could I set this up properly?
I tried to respond c:\somepath\$1 but Fiddler seems to treat it verbatim:
match: regex:(?insx).+//some_domain/([?a-z0-9-=&]+\.)*
respond: c:\somepath\$1
request: http://some_domain/index.html
response: c:\somepath\$1html <-----------

The problem is your use of insx at the front of your expression; the n means that you want to require explicitly-named capture groups, meaning that a group $1 isn't automatically created. You can either omit the n or explicitly name the capture group.
From the Fiddler Book:
Use RegEx Replacements in Action Text
Fiddler’s AutoResponder permits you to use regular expression group replacements to map text from the Match Condition into the Action Text. For instance, the rule:
Match Text: REGEX:.+/assets/(.*)
Action Text: http://example.com/mockup/$1
...maps a request for http://example.com/assets/Test1.gif to http://example.com/mockup/Test1.gif.
The following rule:
Match Text: REGEX:.+example\.com.*
Action Text: http://proxy.webdbg.com/p.cgi?url=$0
...rewrites the inbound URL so that all URLs containing example.com are passed as a URL parameter to a page on proxy.webdbg.com.
Match Text: REGEX:(?insx).+/assets/(?'fname'[^?]*).*
Action Text: C:\src\${fname}
...maps a request for http://example.com/assets/img/1.png?bunnies to C:\src\img\1.png.
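The named-group approach from the last rule can be tried outside Fiddler. The sketch below uses Python's re module, which has no equivalent of the n option, so it only illustrates capturing the path into a named group and reusing it. The `local_path` helper and the explicit slash conversion are assumptions, not Fiddler behaviour.

```python
import re

# Python spells named groups (?P<name>...) instead of (?'name'...).
RULE = re.compile(r"(?is).+/assets/(?P<fname>[^?]*).*")

def local_path(url, root="C:\\src\\"):
    """Map a matching URL to a file path under root, or return None if it does not match."""
    m = RULE.fullmatch(url)
    if m is None:
        return None
    # The query string is excluded by [^?]*; forward slashes become backslashes.
    return root + m.group("fname").replace("/", "\\")

print(local_path("http://example.com/assets/img/1.png?bunnies"))  # C:\src\img\1.png
```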

disallow some domain names, allow others

For example, given the URLs http://www.subdomain1.domain.com.uk and http://www.subdomain2.domain.uk, I need to extract only the name subdomain1 or subdomain2.
But if I receive http://www.subdomain3.co.uk or http://www.subdomain4.com, I need to get the whole host, like subdomain3.co.uk or subdomain4.com.
My expression: ^http:\/\/(?:www\.)?((?!SubdomainToNegate|www).*)((?!\.domain\.com\.uk|\.domain\.uk).*)$
The expression captures the whole URL.
My situation is shown better over there: http://www.rubular.com/r/B1iOUoUq33

^http:\/\/(?:www\.)?((?!SubdomainToNegate|www)[^\.]*)((?!\.domain\.com\.uk|\.domain\.uk).*)$
The difference is this:
(?!SubdomainToNegate|www)[^\.]*
instead of this:
(?!SubdomainToNegate|www).*

Finally found a solution:
http:\/\/(?:www\.)?((?:(?!domain.com.uk|domain.uk)[^\s.]+)(?:\.(?!domain.com.uk|domain.uk)[^\s.]+)*)
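As a quick check, the final expression can be run against the four URLs from the question. This sketch uses Python's re module rather than Rubular; the `extract` helper is a hypothetical name introduced for the example.

```python
import re

PATTERN = re.compile(
    r"http:\/\/(?:www\.)?((?:(?!domain.com.uk|domain.uk)[^\s.]+)"
    r"(?:\.(?!domain.com.uk|domain.uk)[^\s.]+)*)"
)

def extract(url):
    """Return the first capture group for url, or None if the pattern does not match."""
    m = PATTERN.match(url)
    return m.group(1) if m else None

for url in (
    "http://www.subdomain1.domain.com.uk",
    "http://www.subdomain2.domain.uk",
    "http://www.subdomain3.co.uk",
    "http://www.subdomain4.com",
):
    print(url, "->", extract(url))
```

The negative lookahead stops the capture at the first label that begins domain.com.uk or domain.uk, so only the subdomain is returned for those hosts, while other hosts are captured in full.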
