Cloudant NoSQL DB query with regex

I have a Cloudant NoSQL DB with some records like this:
{
"role": "utente",
"nome": "Rocco",
"cognome": "Di Vitto",
"team": "sap pm-cs",
"company": "wrestling",
"appManager": "john Ford",
"teamLeader": "Rendy Orton, Colombo M.R."
}
I would like to query the teamLeader field with a $regex, passing a single string that matches either "Rendy Orton" or "Colombo M.R.", but I can't figure out the regex for matching two comma-delimited strings. I've tried reading up on the Erlang query structure but didn't find a solution. Some help would be very useful. Thanks in advance.

When supplying regular expressions, you should use the $regex operator. Have you tried something like this?
{
"selector": {
"teamLeader": {
"$regex": "(Colombo M.R.)|(Rendy Orton)"
}
}
}
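One detail worth checking locally: an unescaped "." in a regex matches any character, so "M.R." would also match "MXRX". The sketch below is a minimal illustration, assuming Python's re module as a stand-in for Cloudant's PCRE-based engine; build_selector is a hypothetical helper, not part of any Cloudant client library. It builds the same kind of selector with the names escaped and tests the pattern against the sample teamLeader value.

```python
import json
import re


def build_selector(names):
    """Build a Cloudant-style selector matching any of the given names.

    Hypothetical helper for illustration only.
    """
    # re.escape makes characters such as "." literal instead of wildcards.
    pattern = "|".join("(" + re.escape(name) + ")" for name in names)
    return {"selector": {"teamLeader": {"$regex": pattern}}}


selector = build_selector(["Colombo M.R.", "Rendy Orton"])
pattern = selector["selector"]["teamLeader"]["$regex"]

# Local check against the field value from the sample record.
team_leader = "Rendy Orton, Colombo M.R."
print(json.dumps(selector, indent=2))
print(bool(re.search(pattern, team_leader)))
```

json.dumps takes care of doubling the backslashes, which is easy to get wrong when writing the selector by hand.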

Related

MongoDB - Use field in the document as regex expression

I am storing regex expressions in MongoDB and would like to use them for queries.
Documents structure:
{
"name": "foo",
"regex_expression": "^[^# ]+#(bar\\.com)$"
}
I tried the following query but it doesn't work.
db.collection.find({$expr: {$regex: ["foo#bar.com", "$regex_expression"]}})
Is it possible to do this kind of query?
You need the $regexMatch operator.
db.collection.find({
$expr: {
$regexMatch: {
input: "foo#bar.com",
regex: "$regex_expression"
}
}
})
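To see what this query does per document, here is a minimal client-side sketch in Python, assuming the documents are already loaded as dictionaries. It only mirrors the logic (apply each document's own stored pattern to a fixed input string); it is not how MongoDB evaluates $regexMatch internally, and the second document is a hypothetical addition for contrast.

```python
import re

documents = [
    {"name": "foo", "regex_expression": r"^[^# ]+#(bar\.com)$"},
    # Hypothetical second document, added only for contrast.
    {"name": "baz", "regex_expression": r"^[^# ]+#(baz\.org)$"},
]


def find_matching(docs, text):
    """Return the documents whose stored pattern matches the input text."""
    return [d for d in docs if re.search(d["regex_expression"], text)]


matches = find_matching(documents, "foo#bar.com")
print([d["name"] for d in matches])
```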

Find string in between in kibana elastic search with regex like in splunk

In Splunk, we can extract a dynamic string that sits between two fixed strings.
For example, given:
<TextileType>Shirt</TextileType>
<TextileType>Trousers</TextileType>
<TextileType>Shirt</TextileType>
<TextileType>Trousers</TextileType>
<TextileType>Shirt</TextileType>
The output I am expecting:
Shirt - 3
Trousers - 2
I am able to do this easily in Splunk.
How can I achieve this in Kibana?
I have tried many approaches, but I could not write a regex that does what I need.
Note: Here is the example JSON query to which I need to add the regex. In this example I search for "Shirt" manually, whereas I want the value to be picked up dynamically.
{
"query": {
"match": {
"text": {
"query": "Shirt",
"type": "phrase"
}
}
}
}
Assuming the data is in an index named sample, you can use a wildcard query:
GET /sample/_search
{
"query": {
"wildcard":{
"column2":"*Shirt*"
}
}
}
Note that it only returns results containing the keyword Shirt.
If you are looking to clean the data, you would need to run it through a Logstash pipeline that strips the XML tags and leaves you with just the text.
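As a rough illustration of the tag-stripping and counting step, here is a minimal Python sketch that pulls the value out of each TextileType element and counts occurrences. It runs outside Elasticsearch and Logstash; the sample lines are the ones from the question, and count_values is a hypothetical helper name.

```python
import re
from collections import Counter

lines = [
    "<TextileType>Shirt</TextileType>",
    "<TextileType>Trousers</TextileType>",
    "<TextileType>Shirt</TextileType>",
    "<TextileType>Trousers</TextileType>",
    "<TextileType>Shirt</TextileType>",
]

# Non-greedy capture of whatever sits between the two fixed tags.
TAG = re.compile(r"<TextileType>(.*?)</TextileType>")


def count_values(rows):
    """Count each value found between the opening and closing tags."""
    counts = Counter()
    for row in rows:
        counts.update(TAG.findall(row))
    return counts


counts = count_values(lines)
for value, n in counts.most_common():
    print(f"{value} - {n}")
```

Once the bare value is stored in its own field, a terms aggregation on that field can produce the same counts inside Elasticsearch.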

How to write a regexp in Elasticsearch so that it gives me URLs with numbers?

I am trying to write a query in Kibana, which works with the Elasticsearch Query DSL. The basic filter is as follows:
{
"query": {
"match": {
"path": {
"query": "/abc/",
"type": "phrase"
}
}
}
}
Now I need a query that returns documents whose "path" has the form /abc/(0-9)/.
I tried the reference provided here, but it does not make sense to me (I am not well versed in Elasticsearch):
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html
I would like to filter for results of the form path = /abc/12345/
This RegEx might help you to do so:
\x22query\x22:\s\x22(\/.*)\x22
It creates a capturing group around your desired output, which you may be able to reference as $1.
You may add additional boundaries to your pattern, if you wish, such as this RegEx:
\x22query\x22:\s\x22([\/a-z0-9]+)\x22
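For the path shape asked about in the question, a minimal sketch follows. It assumes "path" is indexed as a single unanalyzed term and that a regexp query is acceptable; both are assumptions, not something the answer above states. The pattern /abc/[0-9]+/ is checked locally with re.fullmatch because Lucene regular expressions always match against the whole term, and build_regexp_query is a hypothetical helper that only builds the request body.

```python
import re

# One or more digits between /abc/ and a trailing slash.
PATH_PATTERN = "/abc/[0-9]+/"


def build_regexp_query(field, pattern):
    """Build a regexp query body (hypothetical helper, sends nothing)."""
    return {"query": {"regexp": {field: pattern}}}


def matches_whole_value(pattern, value):
    """Emulate whole-term matching with re.fullmatch."""
    return re.fullmatch(pattern, value) is not None


query = build_regexp_query("path", PATH_PATTERN)
print(query)
print(matches_whole_value(PATH_PATTERN, "/abc/12345/"))
print(matches_whole_value(PATH_PATTERN, "/abc/"))
```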

Elasticsearch aggregation to extract pattern and occurrences

I have trouble formulating what I'm looking for so I'll use an example:
You put 3 documents in elasticsearch all with a field "name" containing these values: "test", "superTest51", "stvv".
Is it possible to extract a regular-expression-like pattern together with its number of occurrences? In this case:
"xxxx": 2 occurrences
"x{5}Xxxx99": 1 occurrence
I've read some things about analyzers, but I don't think that's what I'm looking for.
Edit: To make the question clearer: I don't want to search for a regex pattern, I want to do an aggregate on a regular expression replaced field. For example replace [a-z] with x. Is the best way really to do the regular expression replace outside of elasticsearch?
Based on the formulation of your request, I am not sure this matches what you are looking for, but assuming you mean to search based on a regex, the following should help:
wildcard and regexp queries
Note that the behavior differs depending on whether the targeted field is analyzed or not.
If you went with the vanilla setup of Elasticsearch, as most people do to start, your field is likely analyzed; you can check the mapping of your indices to confirm that.
Based on your example and assuming you have a not_analyzed name field:
GET _search
{
"query": {
"regexp": {
"name": "[a-z]{4}"
}
}
}
GET _search
{
"query": {
"regexp": {
"name": "[a-z]{5}[A-Z][a-z]{3}[0-9]{2}"
}
}
}
Based on your update and a quick search (I am not that familiar with aggregations), something like the following might match your expectations:
GET _search
{
"size": 0,
"aggs": {
"regmatch": {
"filters": {
"filters": {
"xxxx": {
"regexp": {
"name": "[a-z]{4}"
}
},
"x{5}Xxxx99": {
"regexp": {
"name": "[a-z]{5}[A-Z][a-z]{3}[0-9]{2}"
}
}
}
}
}
}
}
This will give you 3 counts:
- total number of events
- number of first regex match
- number of second regex match
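The edit to the question asks about doing the regex replacement outside Elasticsearch. A minimal Python sketch of that approach, assuming the names are available client-side: each lowercase letter becomes x, each uppercase letter X, each digit 9, and the resulting shapes are counted. The to_shape helper is hypothetical, and it writes the long shape out in full ("xxxxxXxxx99") rather than in the compressed "x{5}Xxxx99" notation used in the question.

```python
import re
from collections import Counter


def to_shape(value):
    """Replace character classes with placeholder characters."""
    shape = re.sub(r"[a-z]", "x", value)
    shape = re.sub(r"[A-Z]", "X", shape)
    return re.sub(r"[0-9]", "9", shape)


names = ["test", "superTest51", "stvv"]
shape_counts = Counter(to_shape(name) for name in names)

for shape, n in shape_counts.most_common():
    print(f"{shape}: {n}")
```

Storing the computed shape in its own field at index time would then let an ordinary terms aggregation do the counting.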

elasticsearch - search with regex involving space

I want to search with a regular expression that contains whitespace in Elasticsearch.
I have already set my field to not_analyzed. Its mapping looks like this:
"type1": {
"properties": {
"field1": {
"type": "string",
"index": "not_analyzed",
"store": true
}
}
}
I indexed two values for testing:
"field1":"XXX YYY ZZZ"
"field1":"XXX ZZZ YYY"
Then I ran the regex query /XXX YYY/ (I want this query to find record1 but not record2):
{
"query": {
"query_string": {
"query": "/XXX YYY/"
}
}
}
But it returns 0 results.
However, if I search without the regex (without the forward slashes '/'), both record1 and record2 are returned.
Is it the case that Elasticsearch cannot run a regex query containing a space?
What you need is a term query, which doesn't tokenise the search query by breaking it down into smaller parts. More about the term query here: https://www.elastic.co/guide/en/elasticsearch/reference/2.0/query-dsl-term-query.html
There's a special breed of term queries, called regexp queries, that lets you use regexes. These should match whitespace as well: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html
You can keep using your query string, but your regexp is missing a small part: the .* at the end. The pattern has to match the entire stored value, so /XXX YYY/ alone does not match "XXX YYY ZZZ". With the .* added, you'll get the single result you expect.
{
"query": {
"query_string": {
"query": "/XXX YYY.*/"
}
}
}
You can use regexp queries to achieve this. Mind you, the query performance may be slow. The below query will search for all documents in which the value of field1 contains "XXX YYY".
POST <index_name>/type1/_search
{
"query": {
"regexp": {
"field1": ".*XXX YYY.*"
}
}
}
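The difference between the three patterns above comes down to anchoring: Lucene regular expressions must match the whole term. Here is a minimal sketch that emulates this with Python's re.fullmatch on the two test values from the question. It is only a local approximation of the matching rule, not a call into Elasticsearch, and anchored_matches is a hypothetical helper.

```python
import re

values = ["XXX YYY ZZZ", "XXX ZZZ YYY"]


def anchored_matches(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [v for v in candidates if re.fullmatch(pattern, v)]


# Without a trailing .* the pattern cannot cover the whole value.
print(anchored_matches("XXX YYY", values))
# With .* at the end, only the first record matches.
print(anchored_matches("XXX YYY.*", values))
# A leading .* also allows text before the phrase.
print(anchored_matches(".*XXX YYY.*", values))
```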