The mapping of my Elasticsearch index looks like this:
"settings": {
  "index": {
    "number_of_shards": "5",
    "number_of_replicas": "1"
  }
},
"mappings": {
  "node": {
    "properties": {
      "field1": {
        "type": "keyword"
      },
      "field2": {
        "type": "keyword"
      },
      "query": {
        "properties": {
          "regexp": {
            "properties": {
              "field1": {
                "type": "keyword"
              },
              "field2": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
The problem:
I am building ES queries with elasticsearch_dsl Q(). This works fine in most cases, even when the query contains a complex regexp, but it fails completely when the regexp contains the '!' character: the search returns no results at all.
For example:
1.) Q('regexp', field1 = "^[a-z]{3}.b.*") (works perfectly)
2.) Q('regexp', field1 = "^f04.*") (works perfectly)
3.) Q('regexp', field1 = "f00.*") (works perfectly)
4.) Q('regexp', field1 = "f04baz?") (works perfectly)
It fails in the case below:
5.) Q('regexp', field1 = "f04((?!z).)*") (fails with no results at all)
I tried adding "analyzer": "keyword" alongside "type": "keyword" on the fields above, but then nothing works at all.
In the browser I checked how the keyword analyzer handles the input from the failing case:
The result looks fine:
{
  "tokens": [
    {
      "token": "f04((?!z).)*",
      "start_offset": 0,
      "end_offset": 12,
      "type": "word",
      "position": 0
    }
  ]
}
I'm running my queries like below:
search_obj = Search(using = _conn, index = _index, doc_type = _type).query(Q('regexp', field1 = "f04baz?"))
count = search_obj.count()
response = search_obj[0:count].execute()
logger.debug("total nodes(hits):" + " " + str(
Please help; it's a really annoying problem, since all the other regex characters work fine in every query except '!'.
Also, how do I check which analyzer is currently applied with the above settings in my mappings?

The Lucene regex engine used by Elasticsearch does not support lookarounds of any kind. The ES regex documentation is rather ambiguous here: it says that matching everything with a pattern like .* is very slow, as is using lookaround regular expressions. That is not only ambiguous but also wrong, since lookarounds, when used wisely, may greatly speed up regex matching.
Since you want to match any string that contains f04 and does not contain z, you may actually use
[^z]*f04[^z]*
[^z]* - any 0+ chars other than z
f04 - the f04 substring
[^z]* - any 0+ chars other than z.
In case you have a multicharacter string to "exclude" (say, z4 rather than z), you may keep your approach but use the complement operator:
.*f04.*&~(.*z4.*)
This means almost the same but does not support line breaks:
.* - any chars other than newline, as many as possible
f04 - f04
.* - any chars other than newline, as many as possible
& - AND
~(.*z4.*) - any string other than the one having z4
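As a rough local check of the lookaround-free approach, the first pattern can be tried with Python's re module, because it only uses syntax that Lucene and Python share. This is a sketch under assumptions: re.fullmatch stands in for Lucene's implicit anchoring, the helper names are made up for illustration, and the second pattern is only placed in a query body because Python's re has no & or ~ operators.

```python
import re

# Patterns taken from the answer above (Lucene regexp syntax).
NO_Z = "[^z]*f04[^z]*"        # contains f04 and never contains z
NO_Z4 = ".*f04.*&~(.*z4.*)"   # contains f04 and never contains z4 (Lucene-only operators)


def lucene_like_match(pattern: str, term: str) -> bool:
    """Approximate Lucene's behaviour: the pattern must match the whole term."""
    return re.fullmatch(pattern, term) is not None


def regexp_query(field: str, pattern: str) -> dict:
    """Build a regexp query body as a plain dict (hypothetical helper)."""
    return {"query": {"regexp": {field: {"value": pattern}}}}


if __name__ == "__main__":
    for term in ["f04bar", "xf04", "f04baz", "f03bar"]:
        print(term, lucene_like_match(NO_Z, term))
    print(regexp_query("field1", NO_Z4))
```

With elasticsearch_dsl the same pattern strings would go into Q('regexp', field1=...) exactly as in the question.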


How to match with a regular expression inside an aggregate $switch case in MongoDB?

Here is what I did.
Inside $addFields:
$switch: {
branches: [
{ case: {
/In Progress/i
then:'In Progress'
{ case: {
{ case: {$eq:['$CaseClientStatus','Complete - All Results Clear']}, then:'Complete'},
{ case: {$eq:['$CaseClientStatus','Case on Hold']}, then:'Case on Hold'}
default: 'Other'
But with this, my ClientStatus only ever shows Complete, Other, or Case on Hold, never the values matched by the regex, although the field contains those words.
Here is one of the documents:
"CandidateName": "Bruce Consumer",
"_id": "61b30daeaa237672bb7a17cc",
"CaseClientStatus": "Background Check Case In Progress",
"TAT": "N/A",
"CaseCloseDate": null,
"FormationAutomationStatus": "Automated",
"MethodOfDataSupply": "Automated",
"Status": "Background Case In Progress",
"CreatedDate": "2021-12-10T08:19:58.389Z",
"OrderId": "Ord3954",
"PONumber": "",
"Position": "",
"FacilityCode": "",
"IsCaseClose": false,
"Requester": "Shah Shah",
"ReportErrorList": 0
Assuming you are on version 4.2 or higher (and you should be because 4.2 came out almost 2 years ago) then the $regexFind function gives you what you need. Prior to 4.2, regex was only available in a $match operator, not in complex agg expressions. Your attempt above is admirable but the // regex syntax is not doing what you think it should be doing. Notably, {regex:/Cancelled/i} is simply creating a new object with key regex and string value /Cancelled/i (including the slashes) which clearly will not equal anything in $CaseClientStatus. Here is a solution:
$switch: {
    branches: [
        { case: {
            $ne: [null, {$regexFind: {input: "$CaseClientStatus", regex: /In Progress/i}}]
          }, then:'In Progress'},
        { case: {
            $ne: [null, {$regexFind: {input: "$CaseClientStatus", regex: /Cancelled/i}}]
          }, then:'Cancelled'},
        { case: {$eq:['$CaseClientStatus','Complete - All Results Clear']}, then:'Complete'},
        { case: {$eq:['$CaseClientStatus','Case on Hold']}, then:'Case on Hold'}
    ],
    default: 'Other'
}
It looks like you are trying to take a somewhat free-form status "description" field and create a strong enumerated status from it. I would recommend that your $ClientStatus output be more code-ish e.g. IN_PROGRESS, COMPLETE, CXL etc. Eliminate case and certainly whitespace.
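For anyone building the pipeline from Python, the same idea can be sketched as plain dicts. This is only an illustration under assumptions: the helper names, the RULES list, and the OTHER label are made up here, and the regex is passed in $regexFind's string form with an options field instead of a /.../i literal. The small local function imitates the $switch so the mapping can be sanity-checked without a database.

```python
import re


def regex_branch(pattern: str, label: str, field: str = "$CaseClientStatus") -> dict:
    """One $switch branch: true when $regexFind returns a match (i.e. not null)."""
    find = {"$regexFind": {"input": field, "regex": pattern, "options": "i"}}
    return {"case": {"$ne": [None, find]}, "then": label}


# Labels follow the enumerated style suggested above; they are illustrative only.
RULES = [("In Progress", "IN_PROGRESS"), ("Cancelled", "CXL")]

add_fields_stage = {
    "$addFields": {
        "ClientStatus": {
            "$switch": {
                "branches": [regex_branch(p, label) for p, label in RULES],
                "default": "OTHER",
            }
        }
    }
}


def classify_locally(status: str) -> str:
    """Pure-Python imitation of the $switch above, for quick sanity checks."""
    for pattern, label in RULES:
        if re.search(pattern, status, re.IGNORECASE):
            return label
    return "OTHER"
```

For the sample document, classify_locally("Background Check Case In Progress") falls into the first branch.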

Match decimal as string to 0 in MongoDB without regex

I have a MongoDB collection where decimal numbers are stored as strings. I need to find all items where one such field, quantity, is equal to 0. So when looking for 0 I am actually looking for strings such as "0", "0.0", "0.00", and so on.
I tried to use $toInt as follows
db.MyCollection.find({$toInt(Product.Quantity): 0})
but the query is flagged as wrong, it does not even get executed
After some digging I finally found the solution using regex:
db.MyCollection.find({"Product.Quantity": {$regex: "^0+$|^0+(\.0+)"}})
which does work, but I am sure there is a more straightforward solution; it cannot be this complex. Does anybody have a better solution?
Does this help?
{
  $match: {
    $expr: {
      $eq: [
        {
          "$convert": {
            "input": "$key",
            "to": "double",
            "onError": "$key",
            "onNull": "$key"
          }
        },
        0
      ]
    }
  }
}
Just replace key by your field.
On this example, you can see how it is operating under the hood.
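As a rough local imitation of what the $convert expression above does for each document, the following sketch converts a value to a float and falls back to the original value on errors or nulls. The function names are hypothetical, and Python's float() accepts some inputs (surrounding whitespace, "inf", underscores) that MongoDB's conversion may reject, so treat it as an approximation only.

```python
def convert_to_double_or_keep(value):
    """Rough stand-in for $convert with to: "double" and onError/onNull set to the input."""
    if value is None:
        return value  # onNull: hand back the input unchanged
    try:
        return float(value)
    except (TypeError, ValueError):
        return value  # onError: hand back the input unchanged


def is_zero_quantity(value) -> bool:
    """Mirror of {$eq: [<converted value>, 0]}."""
    converted = convert_to_double_or_keep(value)
    return isinstance(converted, float) and converted == 0


if __name__ == "__main__":
    for raw in ["0", "0.0", "0.00", "1.5", "abc", None]:
        print(repr(raw), is_zero_quantity(raw))
```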
Or using find
{
  $expr: {
    $eq: [
      {
        "$convert": {
          "input": "$key",
          "to": "double",
          "onError": "$key",
          "onNull": "$key"
        }
      },
      0
    ]
  }
}
Yes, a regex is a convenient way to search for numbers stored as strings. You could simplify the regex a bit:
db.MyCollection.find({"Product.Quantity": {$regex: /^0(\.0+)?$/}})
Explanation of regex:
^ ... $ - anchor at the beginning and end
0 - expect a 0
(\.0+)? - followed by optional .0, .00, etc
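The simplified pattern can be checked locally before running it against the collection. A minimal sketch using Python's re module, which handles this particular pattern the same way a PCRE-style engine does; the function name is made up for illustration.

```python
import re

# Same pattern as above: a single 0, optionally followed by a dot and one or more zeros.
ZERO = re.compile(r"^0(\.0+)?$")


def is_zero_string(text: str) -> bool:
    """True for "0", "0.0", "0.00", ...; false for anything else."""
    return ZERO.search(text) is not None


if __name__ == "__main__":
    for sample in ["0", "0.0", "0.00", "0.", "0.5", "10", "00"]:
        print(sample, is_zero_string(sample))
```

Note that, unlike the original ^0+ variant, this pattern does not accept several leading zeros such as "00".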

ElasticSearch regexp query of a path

So far I've used a query that would match paths and get aggregations of those paths:
"query": {
"terms": {
"path.keyword": [
"size": 0,
"aggs": { ...
Since the only difference between the paths is the version number (which keeps changing) I thought about using Regexp query.
In a normal regex I would search for \/api\/v1\.\d\/cc-dashboard\/aggregated.
I know ElasticSearch uses different reserved characters for this and I've tried everything I know, but the search comes back without hits.
Any Thoughts?
I think there are a couple of things to watch out for here. First, make sure that path.keyword is actually of type "keyword"; otherwise you will have problems matching, because you are actually matching against tokens and Elasticsearch will split on /. Second, it doesn't look like Elasticsearch supports \d as shorthand for a digit, but it does allow [0-9]. Third, to escape the . I had to use two backslashes (\\.), because the pattern sits inside a JSON string.
So all together now:
PUT /stackoverflow
{
  "mappings": {
    "properties": {
      "path.keyword": {
        "type": "keyword"
      }
    }
  }
}
POST /stackoverflow/_doc/1
{
  "path.keyword": "/api/v1.0/cc-dashboard/aggregated"
}
POST /stackoverflow/_doc/2
{
  "path.keyword": "/api/v1.1/cc-dashboard/aggregated"
}
POST /stackoverflow/_doc/3
{
  "path.keyword": "/api/not/cc-dashboard/aggregated"
}
GET /stackoverflow/_search
GET /stackoverflow/_search
{
  "query": {
    "regexp": {
      "path.keyword": {
        "value": "/api/v1\\.[0-9]/cc-dashboard/aggregated"
      }
    }
  }
}
DELETE /stackoverflow
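The doubled backslash is easy to get wrong when the request is built in code, so here is a small Python sketch of the same query. The pattern itself contains a single backslash; json.dumps adds the second one when it serializes the body. The function names are made up, and re.fullmatch is used only as a local approximation of Lucene's whole-term matching (this pattern happens to be valid in both syntaxes).

```python
import json
import re

# One backslash in the pattern itself; the raw string keeps it literal.
PATH_PATTERN = r"/api/v1\.[0-9]/cc-dashboard/aggregated"


def path_regexp_query(field: str = "path.keyword") -> dict:
    """Build the regexp query body as a plain dict."""
    return {"query": {"regexp": {field: {"value": PATH_PATTERN}}}}


def matches_locally(path: str) -> bool:
    """Approximate the whole-term match that the regexp query performs."""
    return re.fullmatch(PATH_PATTERN, path) is not None


# Serializing doubles the backslash, which gives the \\. seen in the request above.
body = json.dumps(path_regexp_query())

if __name__ == "__main__":
    print(body)
    print(matches_locally("/api/v1.0/cc-dashboard/aggregated"))
    print(matches_locally("/api/not/cc-dashboard/aggregated"))
```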

Unrecognized character escape in elasticsearch

I have been trying to run a regex search in Elasticsearch with the following query:
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "regexp": {
                "displayName" : "(^a\w+| a(\w+))"
              }
            }
          ]
        }
      }
    }
  }
}
This regex works fine elsewhere, but the above query gives:
nested: QueryParsingException[[bm_md_acct_9993342_v1] Failed to parse]; nested: JsonParseException[Unrecognized character escape 'w' (code 119)\n at [Source: UNKNOWN; line: 10, column: 37]]; }
I tried escaping it in different ways but with no success. How do I properly put the escape sequence?
I tried:
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "regexp": {
                "displayName" : "(^J\\w+| J(\\w+))"
              }
            }
          ]
        }
      }
    }
  }
}
This gives an empty result even though a record with displayName "Jason Cremer" exists.
The regexp query in Elasticsearch is not fully flexible.
For example, \w matches any word character in the usual regex convention, but you cannot write \w in Elasticsearch, since \ is a reserved character there.
To make \w valid in Elasticsearch we would have to escape the \, which turns the regex into \\\w. But this \\\w changes the meaning of your regex:
it matches a literal "\" followed by "w" rather than a word character.
My suggestion is to replace \w in your regex with [a-zA-Z0-9_]. This will work.
Also, you cannot use ^ to anchor the pattern. Remove it from your regex and your query becomes
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "regexp": {
                "displayName" : "(J[a-zA-Z0-9_]+| J([a-zA-Z0-9_]+))"
              }
            }
          ]
        }
      }
    }
  }
}
According to the Elasticsearch regex documentation, its syntax does not support the shorthand character classes that are so common in other regex flavors, so you can't use \w. You can only use character classes (or bracket expressions) like [a-zA-Z] to match letters, or [a-zA-Z0-9_] to match what \w matches in JavaScript.
Next, ^ and $, also common in other flavors, are not supported by ES regex. The whole pattern is anchored by default, thus these are not even necessary.
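Both points can be illustrated with a short Python sketch: an unescaped \w inside a JSON string is rejected by a strict JSON parser (which is what the "Unrecognized character escape" error reports), and the shorthand can be rewritten as an explicit character class before the query is built. The helper names are hypothetical, and the replacement is deliberately naive: it does not handle an escaped backslash followed by w, or shorthand that already sits inside a bracket expression.

```python
import json

# Explicit equivalents for the shorthand classes discussed above.
SHORTHAND = {r"\w": "[a-zA-Z0-9_]", r"\d": "[0-9]"}


def expand_shorthand(pattern: str) -> str:
    """Naively replace \\w and \\d with explicit character classes."""
    for short, explicit in SHORTHAND.items():
        pattern = pattern.replace(short, explicit)
    return pattern


def is_valid_json(text: str) -> bool:
    """True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    raw_body = r'{"displayName": "(^a\w+| a(\w+))"}'
    print(is_valid_json(raw_body))  # the \w escape is not legal JSON
    print(json.dumps({"displayName": expand_shorthand(r"J\w+| J(\w+)")}))
```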
Now, you want any word having J inside. There are several options:
".*J.*" will match any string that contains J
".*J[a-zA-Z].*" will match any string that contains J and then a letter
"J[a-zA-Z].*|.* J[a-zA-Z].*" will match any string that starts with J and then a letter, and then any characters, or any string that contains a space, J, and any letter after it.

ElasticSearch and Regex queries

I am trying to query for documents that have dates within the body of the "content" field.
curl -XGET 'http://localhost:9200/index/_search' -d '{
  "query": {
    "regexp": {
      "content": "^(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.]((19|20)\\d\\d)$"
    }
  }
}'
Getting closer maybe?
curl -XGET 'http://localhost:9200/index/_search' -d '{
"filtered": {
"query": {
"match_all": {}
"filter": {
"content" : "^(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.]((19|20)\\d\\d)$"
My regex seems to have been off. This regex has been validated on regex101. The following query still returns nothing from the 175k documents I have.
curl -XPOST 'http://localhost:9200/index/_search?pretty=true' -d '{
"query": {
"content" : "/[0-9]{4}-[0-9]{2}-[0-9]{2}|[0-9]{2}-[0-9]{2}-[0-9]{4}|[0-9]{2}/[0-9]{2}/[0-9]{4}|[0-9]{4}/[0-9]{2}/[0-9]{2}/g"
I am starting to think that my index might not be set up for such a query. What type of field do you have to use to be able to use regular expressions?
mappings: {
  doc: {
    properties: {
      content: {
        type: string
      }
      title: {
        type: string
      }
      host: {
        type: string
      }
      cache: {
        type: string
      }
      segment: {
        type: string
      }
      query: {
        properties: {
          match_all: {
            type: object
          }
        }
      }
      digest: {
        type: string
      }
      boost: {
        type: string
      }
      tstamp: {
        format: dateOptionalTime
        type: date
      }
      url: {
        type: string
      }
      fields: {
        type: string
      }
      anchor: {
        type: string
      }
    }
  }
}
I want to find any record that has a date and graph the volume of documents by that date. Step 1 is to get this query working. Step 2 will be to pull the dates out and group the documents by them. Can someone suggest a way to get the first part working? I know the second part will be really tricky.
You should read Elasticsearch's Regexp Query documentation carefully; you are making some incorrect assumptions about how the regexp query works.
Probably the most important thing to understand here is what the string you are trying to match is. You are trying to match terms, not the entire string. If this is being indexed with StandardAnalyzer, as I would suspect, your dates will be separated into multiple terms:
"01/01/1901" becomes tokens "01", "01" and "1901"
"01 01 1901" becomes tokens "01", "01" and "1901"
"01-01-1901" becomes tokens "01", "01" and "1901"
"01.01.1901" actually will be a single token: "01.01.1901" (Due to decimal handling, see UAX #29)
You can only match a single, whole token with a regexp query.
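To make the token-level behaviour concrete, here is a small Python sketch. It is only an approximation under stated assumptions: the TOKEN regex is a crude stand-in for the standard analyzer (it is not an implementation of UAX #29 and merely reproduces the four examples above), re.fullmatch stands in for Lucene's whole-term matching, and all names are made up for illustration.

```python
import re

# Very rough stand-in for the standard analyzer: runs of letters/digits,
# optionally joined by dots (so "01.01.1901" stays one token).
TOKEN = re.compile(r"[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*")

# Date pattern without anchors or \d, as recommended for the regexp query.
DATE = "(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)[0-9]{2}"


def rough_tokens(text: str) -> list:
    """Split text into lowercase tokens, roughly as the analyzer would."""
    return [t.lower() for t in TOKEN.findall(text)]


def any_term_matches(pattern: str, text: str) -> bool:
    """A regexp query hits only if one whole token matches the pattern."""
    return any(re.fullmatch(pattern, tok) for tok in rough_tokens(text))


if __name__ == "__main__":
    print(rough_tokens("Born 01/01/1901"))
    print(any_term_matches(DATE, "Born 01/01/1901"))
    print(any_term_matches(DATE, "Born 01.01.1901"))
```

The full date pattern never sees "01/01/1901" as one term, which is why the original queries return nothing on an analyzed field.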
Elasticsearch (and Lucene) don't support full Perl-compatible regex syntax.
In your first couple of examples, you are using anchors, ^ and $. These are not supported. Your regex must match the entire token to get a match anyway, so anchors are not needed.
Shorthand character classes like \d (or \\d) are also not supported. Instead of \\d\\d, use [0-9]{2}.
In your last attempt, you are using /{regex}/g, which is also not supported. Since your regex needs to match the whole string, the global flag wouldn't even make sense in context. Unless you are using a query parser which uses them to denote a regex, your regex should not be wrapped in slashes.
(By the way: How did this one validate on regex101? You have a bunch of unescaped /s. It complains at me when I try it.)
To support this sort of query on such an analyzed field, you'll probably want to look to span queries, and particularly Span Multiterm and Span Near. Perhaps something like:
{
  "span_near" : {
    "clauses" : [
      { "span_multi" : {
          "match": {
            "regexp": {"content": "0[1-9]|[12][0-9]|3[01]"}
          }
      }},
      { "span_multi" : {
          "match": {
            "regexp": {"content": "0[1-9]|1[012]"}
          }
      }},
      { "span_multi" : {
          "match": {
            "regexp": {"content": "(19|20)[0-9]{2}"}
          }
      }}
    ],
    "slop" : 0,
    "in_order" : true
  }
}
For newer Elasticsearch versions (tested on 8.5):
We can query the .keyword subfield of the field. The regexp is then matched against the whole field value rather than against individual tokens.
"size": 10,
"_source": [
"query": {
"bool": {
"should": [
"regexp": {
"load.keyword": {
"value": ".*Search Term.*",
"flags": "ALL"