Kibana regex not working

I need to search for a range value from my logs, but my regex doesn't work in Kibana.
/(took":[1-9][0-9][0-9][,])/g
Content:
{"real_time":"2016-05-03T10:02:13.360Z","content":{"delay":687,"updated":true,"searchItems":{"monitoring_id":"111354","params":{"pass":["111354"],"named":{"d":"2016-04-29|2016-04-30"},"action":"mentions","plugin":null,"controller":"api11","form":[],"url":{"url":"1.1\/mentions\/111354\/","publickey":"yn68FDuQ","time":"1462303544,8356","signature":"102ade1f6749e89be876fdb00a7b9ade","published_date":"2016-04-29|2016-04-30","ipp":"100","page":"14"},"isAjax":false},"source_ids":"","timestamp":"","pagination":"1300, 100","trackerId":"","onlyIds":[],"exceptIds":[],"timezone":"Brazil\/East"},"search":[{"index":"mentions_ro","type":"mention","from":1300,"size":100,"body":{"query":{"bool":{"must":[{"term":{"monitoring.id":"111354"}},{"range":{"published_at":{"gte":"1969-12-31T21:00:00-03:00","lte":"1969-12-31T21:00:00-03:00"}}}]}},"sort":{"published_at":{"order":"desc"}}},"fields":[]}],"response":{"took":500,"timed_out":false,"_shards":{"total":21,"successful":21,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}}}
My regex is working here, however:
https://regex101.com/r/pV4mR7/1
Notes:
I have already tried escaping some characters.
If I look at the request sent to Elasticsearch, Kibana uses a query string:
Any tips?

According to the Elasticsearch documentation, these characters are always metacharacters and must be escaped if you want them as literals:
. ? + * | { } [ ] ( ) " \
These characters are metacharacters under certain modes:
# @ & < > ~
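As a rough illustration of the escaping rule above, the following Python sketch backslash-escapes the always-reserved characters listed. The helper name escape_lucene_regex is made up for this example and is not part of any Elasticsearch client.

```python
# Characters the answer above lists as always reserved in Elasticsearch regexes.
ALWAYS_RESERVED = set('.?+*|{}[]()"\\')


def escape_lucene_regex(literal: str) -> str:
    """Backslash-escape every always-reserved character in `literal`.

    Hypothetical helper for illustration; it ignores the mode-dependent
    characters, which only matter when the matching flags are enabled.
    """
    return "".join("\\" + ch if ch in ALWAYS_RESERVED else ch for ch in literal)


# The literal prefix from the question, followed by an unescaped digit pattern.
print(escape_lucene_regex('took":') + "[1-9][0-9][0-9],")
```

Only the literal part goes through the helper; the character classes are meant as regex syntax and stay unescaped.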
You don't need to put the comma in a char class.
It looks like you might not be able to just throw the regex in the search box.
Kibana only matches regexp over the _all field:
Try to "inspect" one of the elements on your page; you will see that the _all field is hardcoded:
"global": true,
"facet_filter": {
"fquery": {
"query": {
"filtered": {
"query": {
"regexp": {
"_all": {
"value": "category: /pattern/"
> https://github.com/elastic/kibana/issues/631
Try this:
(took\":[1-9][0-9][0-9],)
I'm not familiar with Elasticsearch or Kibana, but your query may end up looking like this:
"regexp": {
"_all": {
"value": "category: /(took\":[1-9][0-9][0-9],)/"
}
}
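One thing worth checking with a body like this is the number of escaping layers: JSON consumes one level of backslashes before the regex engine sees the pattern. The sketch below uses only Python's standard json module to show what survives decoding. The field name _all comes from the answer above; the rest is illustrative and is not the request Kibana actually sends.

```python
import json

# Pattern as the regex engine should see it: backslash + quote, so the
# double quote is treated as a literal character.
pattern = r'took\":[1-9][0-9][0-9],'

# json.dumps adds the JSON layer of escaping (\ becomes \\, " becomes \").
body = json.dumps({"regexp": {"_all": {"value": pattern}}})
print(body)

# Decoding the body gives back exactly the pattern we started with.
decoded = json.loads(body)["regexp"]["_all"]["value"]
print(decoded)
```

If the body is written by hand with only \" in front of the quote, the backslash is consumed by JSON and the regex engine receives a bare double quote.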

Related

Not able to get desired search results in ElasticSearch search api

I have a field "xyz" that I want to search on. The type of the field is keyword. The different values of the field "xyz" are:
a/b/c/d
a/b/c/e
a/b/f/g
a/b/f/h
Now for the following query -
{
  "query": {
    "query_string": {
      "query": "(xyz:(\"a/b/c\"*))"
    }
  }
}
I should only get these two results -
a/b/c/d
a/b/c/e
but i get all the four results -
a/b/c/d
a/b/c/e
a/b/f/g
a/b/f/h
Edit:
Actually, I am not querying Elasticsearch directly. I am using this API, https://atlas.apache.org/api/v2/resource_DiscoveryREST.html#resource_DiscoveryREST_searchWithParameters_POST, which creates the above-mentioned query for Elasticsearch, so I don't have much control over the query_string. What I can change is the Elasticsearch analyzer for this field, or its type.
You'll need to let the query_string parser know you'll be using regex so wrap the whole thing in /.../ and escape the forward slashes:
{
  "query": {
    "query_string": {
      "query": "xyz:/(a\\/b\\/c\\/.*)/"
    }
  }
}
Or, you might as well use a regexp query:
{
  "query": {
    "regexp": {
      "xyz": "a/b/c/.*"
    }
  }
}
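To see why the prefix pattern narrows the results, here is a small Python sketch run against the four values from the question. Elasticsearch regexp patterns are matched against the whole term, which re.fullmatch approximates here. Python's re module is only a stand-in for the Lucene regex engine, so treat this as an illustration of the pattern, not of Elasticsearch itself.

```python
import re

values = ["a/b/c/d", "a/b/c/e", "a/b/f/g", "a/b/f/h"]

# Same pattern as in the regexp query above.
pattern = "a/b/c/.*"

matches = [v for v in values if re.fullmatch(pattern, v)]
print(matches)
```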

How do I define an AWS MetricFilter FilterPattern to match a JSON-formatted log event in CloudWatch?

I am trying to define a metric filter, in an AWS CloudFormation template, to match JSON-formatted log events from CloudWatch.
Here is an example of the log event:
{
"httpMethod": "GET",
"resourcePath": "/deployment",
"status": "403",
"protocol": "HTTP/1.1",
"responseLength": "42"
}
Here is my current attempt to create a MetricFilter that matches the status field, using the examples given in the documentation (FilterAndPatternSyntax):
"DeploymentApiGatewayMetricFilter": {
  "Type": "AWS::Logs::MetricFilter",
  "Properties": {
    "LogGroupName": "/aws/apigateway/DeploymentApiGatewayLogGroup",
    "FilterPattern": "{ $.status = \"403\" }",
    "MetricTransformations": [
      {
        "MetricValue": "1",
        "MetricNamespace": "ApiGateway",
        "DefaultValue": 0,
        "MetricName": "DeploymentApiGatewayUnauthorized"
      }
    ]
  }
}
I get an "Invalid metric filter pattern" message in CloudFormation.
Other variations I've tried that didn't work:
"{ $.status = 403 }" <- no escaped characters
{ $.status = 403 } <- using a json object instead of string
I've been able to successfully filter for space-delimited log events using the bracket notation defined in a similar manner but the json-formatted log events don't follow the same convention.
I ran into the same problem and figured it out by writing a few lines with the aws-cdk to generate the filter pattern template, then comparing that with what I had.
It seems each criterion needs to be wrapped in parentheses.
- FilterPattern: '{ $.priority = "ERROR" && $.message != "*SomeMessagePattern*" }'
+ FilterPattern: '{ ($.priority = "ERROR") && ($.message != "*SomeMessagePattern*") }'
It is unfortunate that the AWS docs for MetricFilter in CloudFormation have no examples of JSON patterns.
I kept running into this error too, because I had the metric filter formatted with double quotes on the outside like this.
FilterPattern: "{ ($.errorCode = '*UnauthorizedOperation') || ($.errorCode = 'AccessDenied*') }"
The docs say:
Strings that consist entirely of alphanumeric characters do not need to be quoted. Strings that have unicode and other characters such as '@,' '$,' '\,' etc. must be enclosed in double quotes to be valid.
It didn't explicitly list the splat/wildcard * character, so I thought it would be OK inside single quotes, but it kept saying the metric filter pattern was bad because of the * in single quotes.
I could have used single quotes around the outside of the pattern and double quotes around the strings inside, but I opted for escaping the double quotes like this instead:
FilterPattern: "{ ($.errorCode = \"*UnauthorizedOperation\") || ($.errorCode = \"AccessDenied*\") }"
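The quoting rules in these two answers can be sketched in Python: build the pattern with parenthesized conditions and double-quoted string values, then let a JSON serializer handle the escaping for a JSON template. The helper name json_filter_pattern and its argument format are invented for this example. The output shape follows the answers above and has not been validated against CloudFormation.

```python
import json


def json_filter_pattern(conditions, operator="||"):
    """Join (selector, comparison, value) triples into a JSON filter pattern.

    Hypothetical helper: each condition is wrapped in parentheses and each
    value in double quotes, as the answers above suggest.
    """
    parts = [f'({selector} {comparison} "{value}")'
             for selector, comparison, value in conditions]
    return "{ " + f" {operator} ".join(parts) + " }"


pattern = json_filter_pattern([
    ("$.errorCode", "=", "*UnauthorizedOperation"),
    ("$.errorCode", "=", "AccessDenied*"),
])
print(pattern)

# Serializing the property shows the escaped form needed in a JSON template.
print(json.dumps({"FilterPattern": pattern}))
```

Generating the template this way avoids hand-counting backslashes, which is where both answers report going wrong.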

Elasticsearch matching words

I have a problem. I want my query to match mysql as well as mysql!, mysql(, mysql ... In other words, I want the results to contain only values that consist of the single word "mysql", optionally surrounded by special characters.
For example, the query should return mysql, mysql, #mysql, mysql$, but not "mysql and R" or "mysql mysql".
I tried the query below, but I keep getting "query malformed, no field after start_object", so I am not sure what to do.
I would also like to know whether there are other ways to accomplish the same task.
POST skills/_search
{
  "query": {
    "filtered": {
      "query": {
        "match": {
          "skillname": "mysql"
        }
      },
      "filter": {
        "type":"pattern_capture",
        "patterns":["mysql([##]\\w+)"],
        "preserve_original": true
      }
    }
  }
}
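The matching rule described in this question can be pinned down outside Elasticsearch. The Python sketch below is one reading of the requirement: the word mysql, optionally surrounded by non-word characters, and nothing else. It uses Python's re module, not an Elasticsearch query, and the case-insensitive flag is an assumption.

```python
import re

# Optional non-word characters, the word itself, optional non-word characters.
ONLY_MYSQL = re.compile(r"\W*mysql\W*", re.IGNORECASE)


def is_only_mysql(skillname: str) -> bool:
    """True if the value is just "mysql" plus surrounding special characters."""
    return ONLY_MYSQL.fullmatch(skillname) is not None


for value in ["mysql", "#mysql", "mysql$", "mysql!", "mysql and R", "mysql mysql"]:
    print(value, is_only_mysql(value))
```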

Elasticsearch: How can I filter & group by specific URL paths?

I've got an index, urls, which looks like this:
path: {
  type: "string"
},
@timestamp: {
  type: "date",
  format: "strict_date_optional_time||epoch_millis"
},
The path field stores the path section of a URL, e.g.:
https://facebook.com/profile/photos/album/1
Would be stored as:
/profile/photos/album/1
I'm storing all sorts of paths, so there could be more like:
/profile/photos/album/1
/profile/photos/album/2
/profile/photos/album/2
/profile/photos/album/2
/profile/friends/1
/profile/friends/2
/newsfeed/me/
/newsfeed/me/
/newsfeed/friendName/
I'm trying to find out the number of unique pageviews each of the paths has. I'm unsure how to do this; should I use a regexp?
I'd imagine it'd look something like (pseudo code):
{
"query": {
"regexp": {
"path": ""
},
"unique": true
}
}
So I found out how to do this. I'm using the aggs method & using a regex to exclude results!
{
  "size": 0, // Don't return any _source results
  "aggs": {
    "path": { // Label of the aggregation
      "terms": {
        "field": "path",
        "exclude": ".*(media|cache).*" // Values to exclude, separated by |
      }
    }
  }
}
Breakdown:
path
Just the label of aggregation
field (path)
The field which I want to run the following regex on
exclude
Don't return buckets for paths that contain media or cache
I found this out from Elasticsearch: Run aggregation on field & filter out specific values using a regexp not matching values
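As a mental model of what this aggregation returns, the Python sketch below counts the sample paths from the question and drops those matching the exclude pattern. It uses collections.Counter and re.fullmatch as stand-ins for the terms aggregation and the Lucene regex, so it illustrates the idea only. Note that these are per-path document counts, not unique visitors. The two sample paths containing "cache" and "media" are hypothetical additions so that the exclusion has something to remove.

```python
import re
from collections import Counter

paths = [
    "/profile/photos/album/1",
    "/profile/photos/album/2",
    "/profile/photos/album/2",
    "/profile/photos/album/2",
    "/profile/friends/1",
    "/profile/friends/2",
    "/newsfeed/me/",
    "/newsfeed/me/",
    "/newsfeed/friendName/",
    "/cache/thumb.png",   # hypothetical, not in the question
    "/media/video/1",     # hypothetical, not in the question
]

exclude = re.compile(".*(media|cache).*")

# One bucket per distinct path, skipping paths the exclude pattern matches.
buckets = Counter(p for p in paths if not exclude.fullmatch(p))

for path, doc_count in buckets.most_common():
    print(path, doc_count)
```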

Elasticsearch - behavior of regexp query against a non-analyzed field

What is the default behavior for a regexp query against a non-analyzed field? Also, is the answer the same when dealing with .raw fields?
After everything I've read, I understand the following.
1. Regexp queries will work on analyzed and non-analyzed fields.
2. In non-analyzed fields, a regexp query should work across the entire phrase rather than just matching a single token.
Here's the problem, though: I cannot actually get this to work. I've tried it across multiple fields.
The setup I'm working with is a stock ELK install, and I'm dumping pfSense and Snort logs into it with a basic parser. I'm currently on Kibana 4.3 and ES 2.1.
I queried the mapping for one of the fields, and it indicates the field is not_analyzed, yet the regex does not work across the entire field.
"description": {
  "type": "string",
  "norms": {
    "enabled": false
  },
  "fields": {
    "raw": {
      "type": "string",
      "index": "not_analyzed",
      "ignore_above": 256
    }
  }
}
What am I missing here?
If a field is not analyzed, its whole value is indexed as a single token.
The same applies to .raw fields, at least in my experience.
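The difference can be pictured with a small Python sketch. It assumes, as a simplification, that an analyzed field is lowercased and split on non-alphanumeric characters (roughly what the standard analyzer does), and that a regexp query has to match one whole token. re.fullmatch stands in for the Lucene regex engine, and the sample log text is made up.

```python
import re

description = "Blocked inbound TCP connection"  # made-up sample value

# Analyzed field: several lowercase tokens (simplified analyzer).
analyzed_tokens = [t for t in re.split(r"[^0-9A-Za-z]+", description.lower()) if t]

# not_analyzed field (for example description.raw): the whole value is one token.
raw_tokens = [description]


def regexp_hits(pattern, tokens):
    """Return the tokens that the pattern matches in full."""
    return [t for t in tokens if re.fullmatch(pattern, t)]


phrase_pattern = "Blocked inbound.*"
print(regexp_hits(phrase_pattern, analyzed_tokens))  # no single token matches
print(regexp_hits(phrase_pattern, raw_tokens))       # the whole value matches
```

Under these assumptions, a phrase-level pattern can only succeed when it is run against the not_analyzed sub-field rather than the analyzed parent field.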
You can use a Groovy script:
matcher = (doc[fields.raw].value =~ /${pattern}/);
if (matcher.matches()) {
  matcher.group(matchname)
}
You can pass pattern and matchname in params.
What do you mean by "tried it across multiple fields"? If your situation is more complex, you could write a native Java plugin.
UPDATE
{
  "script_fields" : {
    "regexp_field" : {
      "script" : "matcher = (doc[fieldname].value =~ /${pattern}/ );if(matcher.matches()) {matcher.group(matchname)}",
      "params" : {
        "pattern" : "your pattern",
        "matchname" : "your match",
        "fieldname" : "fields.raw"
      }
    }
  }
}