I'm having trouble building the correct query. I have an index with a field "ids" with the following mapping:
"ids" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
A sample content could look like this:
10,20,30
It's a list of ids. Now I want to query for several possible ids as a disjunction (OR), so I decided to use a regexp:
{
"query" : {
"bool" : {
"must" : [
{
"query_string" : {
"query" : "Test"
}
},
{
"regexp" : {
"ids" : {
"value" : "10031|20|10038",
"boost" : 1
}
}
}
]
}
},
"size" : 10,
"from" : 0
}
The query is executed successfully but with no results. I expected to find 3 results.
If you want to get 10031 or 20 or 10038, you need to add parentheses.
Change "10031|20|10038" to "(10031|20|10038)".
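If the ids come from application code, the parenthesised pattern can be assembled from a list. This is only a minimal Python sketch: the helper name build_ids_regexp_clause is made up for illustration, it assumes the ids are integers, and it just builds the clause as a dict without talking to Elasticsearch.

```python
import json


def build_ids_regexp_clause(ids, field="ids"):
    """Return a regexp clause matching any of the given ids (hypothetical helper)."""
    # int() guards against anything that is not a plain number ending up in the pattern.
    alternatives = "|".join(str(int(i)) for i in ids)
    return {"regexp": {field: {"value": f"({alternatives})", "boost": 1}}}


clause = build_ids_regexp_clause([10031, 20, 10038])
print(json.dumps(clause, indent=2))
```

The resulting dict can replace the second entry of the "must" array in the query above.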
I want to filter the decimal values in a nested object of a JSON document. In the sample code below I want to apply a LIKE-style match to the field names so that I only get the values for keys such as t1 and t2 from the sample document.
Sample code:
db.getCollection('temp').find({},{"temp.text./.*t.*/.value":1})
Sample Json file:
{
"_id" :0"),
"temp" : {
"text" : {
"t1" : {
"value" : "960"
},
"t2" : {
"value" : "959"
},
"t3" : {
"value" : "961"
},
"t4" : {
"value" : "962"
},
"t5" : {
"value" : "6.0"
}
}
}
}
MongoDB doesn't have a way to filter field names directly other than projection, which is exact match only.
However, using aggregation you can use $objectToArray, which would convert the object {"t1" : {"value" : "960"}} to [{"k":"t1","v":{"value":"960"}}]. You can then filter based on the value of k, and use $arrayToObject to convert the entries left back into an object.
db.getCollection('temp').aggregate([
{$addFields:{
"temp.text":{
$arrayToObject:{
$filter:{
input:{$objectToArray:"$temp.text"},
cond:{
$regexMatch:{
input:"$$this.k",
regex:/t/
}
}
}
}
}
}}
])
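To see what the three operators do step by step, here is a plain-Python sketch of the same idea on an in-memory dict. It does not use a MongoDB driver, and filter_keys is a made-up name. Note that a pattern like /t/ matches every key in the sample (t1 to t5); something narrower such as ^t[12]$ would be needed to keep only t1 and t2.

```python
import re


def filter_keys(doc, pattern):
    """Keep only the entries of doc["temp"]["text"] whose key matches pattern."""
    # $objectToArray: {"t1": {...}} -> [{"k": "t1", "v": {...}}]
    entries = [{"k": k, "v": v} for k, v in doc["temp"]["text"].items()]
    # $filter with $regexMatch on the key (unanchored search, like $regexMatch)
    kept = [e for e in entries if re.search(pattern, e["k"])]
    # $arrayToObject: turn the remaining k/v pairs back into an object
    result = dict(doc)
    result["temp"] = dict(doc["temp"], text={e["k"]: e["v"] for e in kept})
    return result


doc = {"temp": {"text": {
    "t1": {"value": "960"}, "t2": {"value": "959"}, "t3": {"value": "961"},
    "t4": {"value": "962"}, "t5": {"value": "6.0"},
}}}

only_t1_t2 = filter_keys(doc, r"^t[12]$")
print(only_t1_t2["temp"]["text"])
```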
This seems like it should be really simple but I haven't found any examples or documentation. I've got a DynamoDB table that looks like this:
record 1: {name, email, items[{product}, {item2}, {item3}]}
record 2: {name, email, items[{product}, {item2}, {item3}]}
I need to be able to update individual items elements, e.g. update the first item object in record 1. I can do this with the following code by hardcoding the list index, but I can't figure out how to pass the item number into the update expression:
{
"version" : "2017-02-28",
"operation" : "UpdateItem",
"key" : {
"id" : { "S" : "${context.arguments.input.id}" }
},
"update" : {
"expression" : "SET #items[0].#product = :productVal",
"expressionNames" : {
"#items": "items",
"#product": "product"
},
"expressionValues" : {
":productVal": { "S" : "${context.arguments.input.product}" }
}
}
}
Have you tried something like:
"update" : {
"expression" : "SET #items[:idx].#product = :productVal",
"expressionNames" : {
"#items": "items",
"#product": "product"
},
"expressionValues" : {
":productVal": { "S" : "${context.arguments.input.product}" },
":idx": { "N" : 0 }
}
}
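One caveat, as far as I know: DynamoDB wants the list index in an update expression to be a literal number, so a placeholder such as :idx may be rejected as a syntax error. If that happens, a workaround is to write the number into the expression string itself. The Python sketch below only builds the "update" section as a dict to show the shape; build_update and its arguments are hypothetical names, not part of any AWS SDK or of the mapping template language.

```python
def build_update(index, product):
    """Build the "update" section with the list index inlined (hypothetical helper)."""
    # int() raises ValueError for non-numeric strings, so arbitrary text
    # cannot be spliced into the expression.
    idx = int(index)
    if idx < 0:
        raise ValueError("list index must be non-negative")
    return {
        "expression": f"SET #items[{idx}].#product = :productVal",
        "expressionNames": {"#items": "items", "#product": "product"},
        "expressionValues": {":productVal": {"S": product}},
    }


update = build_update(2, "widget")
print(update["expression"])
```

In a mapping template the same effect would come from interpolating an index argument into the "expression" string, provided the input carries one.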
I have the following query:
{
"query" : {
"bool" : {
"must" : [
{
"query_string" : {
"query" : "dog cat",
"analyzer" : "standard",
"default_operator" : "AND",
"fields" : ["title", "content"]
}
},
{
"range" : {
"dateCreate" : {
"gte" : "2018-07-01T00:00:00+0200",
"lte" : "2018-07-31T23:59:59+0200"
}
}
},
{
"regexp" : {
"articleIds" : {
"value" : ".*?(2561|30|540).*?",
"boost" : 1
}
}
}
]
}
}
}
The fields title, content and articleIds are of type text, dateCreate is of type date. The articleIds field contains some IDs (comma-separated).
Ok, what happens now? I execute the query and get two results. Both documents contain the words "dog" and "cat" in the title or in the content. So far that's correct.
But the second result has the number 3507 in the articleIds field, which doesn't match my query. It seems that the regexp is ignored because title and content already match. What is wrong here?
And here's the document that should not match my query but does:
{
"_index" : "example",
"_type" : "doc",
"_id" : "3007780",
"_score" : 21.223656,
"_source" : {
"dateCreate" : "2018-07-13T16:54:00+0200",
"title" : "",
"content" : "Its raining cats and dogs.",
"articleIds" : "3507"
}
}
And what I'm expecting is that this document should not be in the results because it contains 3507 which is not part of my query...
I'm doing an insert from Logstash into ElasticSearch. My problem is that I used a template in ES to lay out the data types, and I am sometimes getting values from Logstash that are null values (or dashes) when I've declared in ES that they should be doubles.
So sometimes, ES is getting a '-' instead of something like "2342", and it is rejecting it and causing an error. Now, if I can replace the '-' with the word 'null', ES works fine.
How do I do this? I assume it works with the ruby filter. I need to be able to replace the '-' fields with null when appropriate.
EDIT:
I was asked for sample configs.
So, for example, say the below config is logstash, which will then send data to ES:
filter {
if [type] == "transaction" {
grok {
match => ["message", "%{BASE16FLOAT:ts}\t%{IP:orig_ip}\t%{NOTSPACE:orig_port}" ]
}
}
}
Now my ES template is saying:
"transaction" : {
"properties" :
{
"ts" : {
"format" : "dateOptionalTime",
"type" : "date"
},
"orig_ip" : {
"type" : "ip"
},
"orig_port" : {
"type" : "long"
}
}
}
So if I throw a data set like either of these, it passes:
{"ts" : "123456789.123234", "orig_ip" : "10.0.0.1", "orig_port" : "2342" }
{"ts" : "123456789.123234", "orig_ip" : "10.0.0.1", "orig_port" : null }
I get a success. But, the following [obviously] fails:
{"ts" : "123456789.123234", "orig_ip" : "10.0.0.1", "orig_port" : "-" }
How can I ensure that the "-" (with quotes) gets changed to a null?
If you amend your template by specifying "ignore_malformed": true in your orig_port long field, it should work.
"transaction" : {
"properties" :
{
"ts" : {
"format" : "dateOptionalTime",
"type" : "date"
},
"orig_ip" : {
"type" : "ip"
},
"orig_port" : {
"type" : "long",
"ignore_malformed": true <---- add this line
}
}
}
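If you would rather clean the value before it reaches ES, which is what the question literally asks for, the logic is just a mapping from the placeholder to null. Here is a minimal Python sketch of that logic only (it is not Logstash config, and the function name and default field list are assumptions):

```python
import json


def dashes_to_none(event, fields=("orig_port",)):
    """Return a copy of event with "-" replaced by None in the given fields."""
    cleaned = dict(event)
    for name in fields:
        if cleaned.get(name) == "-":
            cleaned[name] = None
    return cleaned


event = {"ts": "123456789.123234", "orig_ip": "10.0.0.1", "orig_port": "-"}
cleaned = dashes_to_none(event)
print(json.dumps(cleaned))  # None is serialised as JSON null
```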
I am looking for a way to get distinct "unit" values from a collection that has a structure similar to the following:
{
"_id" : ObjectId("548b1aee6e444414f00d5cf1"),
"KPI" : {
"NPV" : {
"value" : 100,
"unit" : "kUSD"
},
"NPM" : {
"value" : 100,
"unit" : "kUSD"
},
"GPM" : {
"value" : 50,
"unit" : "CAD"
}
}
}
I looked into using wildcards and regex but from what I have come across this is not supported for field matching. I would like to do something like db.collection.distinct('KPI.*.unit') but cannot determine how and it seems like performance would be poor. Does anyone have a recommendation? Thanks.
It's not a good practice to make the keys a part of the content of the document - don't use keys as data. If you don't change your document structure, you'll need to know what the possible subfields of KPI are. If you don't know what those could be, you will need to examine the documents manually to find them. Then you can issue a distinct for each using dot notation, e.g. db.collection.distinct("KPI.NPM.unit").
If what you're looking for instead is the distinct values of unit across all values of the parent KPI subfield, then you could take the union of all of the results of the distincts. You can also do it easily with the aggregation framework in MongoDB 2.6. For simplicity, I'll assume there are just three distinct subfields of KPI, the ones in the document above.
db.collection.aggregate([
{ "$group" : { "_id" : 0, "NPVunits" : { "$addToSet" : "$KPI.NPV.unit" }, "NPMunits" : { "$addToSet" : "$KPI.NPM.unit" }, "GPMunits" : { "$addToSet" : "$KPI.GPM.unit" } } },
{ "$project" : { "distinct_units" : { "$setUnion" : ["$NPVunits", "$NPMunits", "$GPMunits"] } } }
])
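To make the two stages concrete, here is a plain-Python sketch of what they compute on in-memory documents: one set per subfield (the $addToSet accumulators), then their union ($setUnion). It does not use a MongoDB driver, and the second sample document is invented for the example.

```python
def distinct_units(docs, subfields=("NPV", "NPM", "GPM")):
    """Collect the distinct "unit" values across the known KPI subfields."""
    # One set per KPI subfield, like the $addToSet accumulators in $group.
    per_field = {name: set() for name in subfields}
    for doc in docs:
        for name in subfields:
            unit = doc.get("KPI", {}).get(name, {}).get("unit")
            if unit is not None:
                per_field[name].add(unit)
    # Union of all sets, like $setUnion in $project.
    return set().union(*per_field.values())


docs = [
    {"KPI": {"NPV": {"value": 100, "unit": "kUSD"},
             "NPM": {"value": 100, "unit": "kUSD"},
             "GPM": {"value": 50, "unit": "CAD"}}},
    {"KPI": {"NPV": {"value": 7, "unit": "EUR"}}},  # invented sample
]
units = distinct_units(docs)
print(sorted(units))
```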
You could also structure your data as dynamic attributes. The document above would be recast as something like
{
"_id" : ObjectId("548b1aee6e444414f00d5cf1"),
"KPI" : [
{ "type" : "NPV", "value" : 100, "unit" : "kUSD" },
{ "type" : "NPM", "value" : 100, "unit" : "kUSD" },
{ "type" : "GPM", "value" : 50, "unit" : "CAD" }
]
}
Querying for distinct units is easy now, whether you want it per type or over all types:
Per type (all types in one query)
db.collection.aggregate([
{ "$unwind" : "$KPI" },
{ "$group" : { "_id" : "$KPI.type", "units" : { "$addToSet" : "$KPI.unit" } } }
])
Over all types
db.collection.distinct("KPI.unit")
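Moving existing documents to this layout is mechanical. Below is a Python sketch on plain dicts (no driver involved, and the function names are only illustrative) that recasts the KPI object into the array form and then mirrors both kinds of distinct:

```python
from collections import defaultdict


def recast(doc):
    """Turn {"KPI": {"NPV": {...}}} into {"KPI": [{"type": "NPV", ...}]}."""
    kpis = [{"type": name, **fields} for name, fields in doc["KPI"].items()]
    return {**doc, "KPI": kpis}


def units_per_type(docs):
    # Mirrors $unwind followed by $group with $addToSet.
    result = defaultdict(set)
    for doc in docs:
        for kpi in doc["KPI"]:
            result[kpi["type"]].add(kpi["unit"])
    return dict(result)


def all_units(docs):
    # Mirrors distinct("KPI.unit").
    return {kpi["unit"] for doc in docs for kpi in doc["KPI"]}


old_doc = {"_id": "548b1aee6e444414f00d5cf1", "KPI": {
    "NPV": {"value": 100, "unit": "kUSD"},
    "NPM": {"value": 100, "unit": "kUSD"},
    "GPM": {"value": 50, "unit": "CAD"},
}}
new_docs = [recast(old_doc)]
print(units_per_type(new_docs))
print(all_units(new_docs))
```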