Ordering a terms aggregation with a multi-bucket sub-aggregation

Given a terms aggregation (label), I would like to sort the buckets by a string field (energy).
The problem is that a multi-bucket value cannot be used in the order clause.
For a given label, I'm sure there is only one energy, so what I would like to do is use the first (and only) result of my energy sub-aggregation.
I'm using the AWS Elasticsearch service, which runs version 1.5 and has scripts disabled, so I have not found a way to sort the buckets by another term :(
Any ideas?
{
  "aggs" : {
    "label" : {
      "terms" : {
        "field" : "label",
        "order" : { "energy[0]" : "desc" } // cannot do this
      },
      "aggs" : {
        "energy" : {
          "terms" : {
            "field" : "energy",
            "size" : 1
          }
        }
      }
    }
  }
}
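Since each label is stated to have exactly one energy, one possible workaround might be to invert the nesting: aggregate on energy first, ordered by its own key (which 1.x supports via _term), and collect the labels underneath. That groups and orders the label buckets by energy without needing scripts. A sketch, not a tested answer:
{
  "aggs" : {
    "energy" : {
      "terms" : {
        "field" : "energy",
        "order" : { "_term" : "desc" }
      },
      "aggs" : {
        "label" : {
          "terms" : { "field" : "label" }
        }
      }
    }
  }
}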

Related

Is there a LIKE-style statement available in MongoDB?

I want to filter by key name inside a child object of a JSON document. In the sample below, I want to apply a LIKE-style match so that I only get the values for keys such as t1 and t2.
Sample code:
db.getCollection('temp').find({},{"temp.text./.*t.*/.value":1})
Sample JSON document:
{
  "_id" : 0,
  "temp" : {
    "text" : {
      "t1" : { "value" : "960" },
      "t2" : { "value" : "959" },
      "t3" : { "value" : "961" },
      "t4" : { "value" : "962" },
      "t5" : { "value" : "6.0" }
    }
  }
}
MongoDB doesn't have a way to filter field names directly other than projection, which is exact match only.
However, using the aggregation framework you can use $objectToArray, which converts the object {"t1" : {"value" : "960"}} to [{"k":"t1","v":{"value":"960"}}]. You can then filter based on the value of k, and use $arrayToObject to convert the remaining entries back into an object.
db.getCollection('temp').aggregate([
  {$addFields: {
    // Rebuild temp.text, keeping only the entries whose key matches the regex
    "temp.text": {
      $arrayToObject: {
        $filter: {
          input: {$objectToArray: "$temp.text"},   // [{k: "t1", v: {value: "960"}}, ...]
          cond: {
            $regexMatch: {                          // requires MongoDB 4.2+
              input: "$$this.k",
              regex: /t/
            }
          }
        }
      }
    }
  }}
])
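Note that the regex /t/ keeps every key containing a t (all of t1 through t5 here). For the (t1,t2) case from the question, it could be tightened; a sketch, again assuming $regexMatch is available (MongoDB 4.2+):
db.getCollection('temp').aggregate([
  {$addFields: {
    "temp.text": {
      $arrayToObject: {
        $filter: {
          input: {$objectToArray: "$temp.text"},
          // anchor the pattern so only the keys t1 and t2 survive
          cond: {$regexMatch: {input: "$$this.k", regex: /^t[12]$/}}
        }
      }
    }
  }}
])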

Elasticsearch regexp query finds no results

I'm having trouble building the correct query. I have an index with a field "ids" and the following mapping:
"ids" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
Sample content could look like this:
10,20,30
It's a list of ids. Now I want to query with several possible ids as a disjunction (OR), so I decided to use a regexp:
{
  "query" : {
    "bool" : {
      "must" : [
        {
          "query_string" : {
            "query" : "Test"
          }
        },
        {
          "regexp" : {
            "ids" : {
              "value" : "10031|20|10038",
              "boost" : 1
            }
          }
        }
      ]
    }
  },
  "size" : 10,
  "from" : 0
}
The query executes successfully but returns no results. I expected to find 3 results.
If you want to match 10031 or 20 or 10038, you need to add parentheses.
Change "10031|20|10038" => "(10031|20|10038)"

Basic geosearch with ElasticSearch

I'm putting together a proof of concept on AWS using Dynamo and the Amazon ElasticSearch service, and I'm having some trouble getting the geo queries to work.
I've checked the ES Dashboard and see the following:
I have an index [assets] and a mapping [asset_types]. Below is a sample of some of the mappings, including the relevant location field:
filename    *string*
checksum    *string*
added_date  *date*
General     [this is a map]
  location
    lat     *string*
    lon     *string*
make        *string*
model       *string*
I want the geo searches to run against the "General.location" field. I've tried a couple of different queries so far without any luck, but I'm sure I'm missing something rather obvious.
One is from the official documentation, modified to the below, which results in this error:
"reason": "failed to parse search source. unknown search element [bool]",
POST assets/_search
{
  "bool" : {
    "must" : {
      "match_all" : {}
    },
    "filter" : {
      "geo_distance" : {
        "distance" : "200km",
        "General.location" : {
          "lat" : 40,
          "lon" : -70
        }
      }
    }
  }
}
I've also tried a slightly different query, which raises "reason": "failed to find geo_point field [General.location]":
POST assets/_search
{
  "filter" : {
    "geo_distance" : {
      "distance" : "1km",
      "General.location" : {
        "lat" : 40,
        "lon" : -70
      }
    }
  },
  "query" : {
    "match_all" : {}
  }
}
Am I running the queries incorrectly? Do I need to update the mapping in the index to declare the geo_point type? I thought that if I formatted the fields properly that wasn't a requirement.
Thanks
The issue lies in your mapping: your General.location field is not mapped as a geo_point. That's the reason you get the error failed to find geo_point field.
So instead of
General [this is a map]
  location
    lat *string*
    lon *string*
You need to have
General [this is a map]
  location *geo_point*
So you need to modify your mapping accordingly and reindex your data.
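A sketch of what that mapping change could look like, assuming the index and type names from the question (note that an existing string field cannot be changed to geo_point in place, which is why the data has to be reindexed, e.g. into a freshly created index):
PUT assets/_mapping/asset_types
{
  "properties": {
    "General": {
      "properties": {
        "location": { "type": "geo_point" }
      }
    }
  }
}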
The second issue you have is that your first query needs to be enclosed in a query section:
POST assets/_search
{
  "query" : {
    "bool" : {
      "must" : {
        "match_all" : {}
      },
      "filter" : {
        "geo_distance" : {
          "distance" : "200km",
          "General.location" : {
            "lat" : 40,
            "lon" : -70
          }
        }
      }
    }
  }
}
Once you've fixed both issues, you'll be able to run your query successfully.
In addition to what Val said, I explicitly created a new mapping for the location. Note for you other novices out there: I needed to use a nested properties update in order to create "General.deviceLocation". After I did this, Val's updated query worked.
PUT assets/_mapping/assets_type
{
  "properties": {
    "General": {
      "properties": {
        "deviceLocation": {
          "type": "geo_point"
        }
      }
    }
  }
}

MongoDB Query For Fields That Vary - Wildcards?

I am looking for a way to get distinct "unit" values from a collection that has a structure similar to the following:
{
  "_id" : ObjectId("548b1aee6e444414f00d5cf1"),
  "KPI" : {
    "NPV" : {
      "value" : 100,
      "unit" : "kUSD"
    },
    "NPM" : {
      "value" : 100,
      "unit" : "kUSD"
    },
    "GPM" : {
      "value" : 50,
      "unit" : "CAD"
    }
  }
}
I looked into using wildcards and regex, but from what I have come across this is not supported for field-name matching. I would like to do something like db.collection.distinct('KPI.*.unit') but cannot determine how, and it seems like performance would be poor. Does anyone have a recommendation? Thanks.
It's not a good practice to make the keys a part of the content of the document - don't use keys as data. If you don't change your document structure, you'll need to know what the possible subfields of KPI are. If you don't know what those could be, you will need to examine the documents manually to find them. Then you can issue a distinct for each using dot notation, e.g. db.collection.distinct("KPI.NPM.unit").
If what you're looking for instead is the distinct values of unit across all values of the parent KPI subfield, then you can take the union of the results of those distincts. You can also do it easily with the aggregation framework in MongoDB 2.6. For simplicity, I'll assume there are just three distinct subfields of KPI, the ones in the document above.
db.collection.aggregate([
  { "$group" : {
      "_id" : 0,
      "NPVunits" : { "$addToSet" : "$KPI.NPV.unit" },
      "NPMunits" : { "$addToSet" : "$KPI.NPM.unit" },
      "GPMunits" : { "$addToSet" : "$KPI.GPM.unit" }
  } },
  { "$project" : { "distinct_units" : { "$setUnion" : ["$NPVunits", "$NPMunits", "$GPMunits"] } } }
])
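Against the sample document above, this should yield something like the following ($setUnion makes no ordering guarantee for the array elements):
{ "_id" : 0, "distinct_units" : [ "kUSD", "CAD" ] }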
You could also structure your data as dynamic attributes. The document above would be recast as something like
{
  "_id" : ObjectId("548b1aee6e444414f00d5cf1"),
  "KPI" : [
    { "type" : "NPV", "value" : 100, "unit" : "kUSD" },
    { "type" : "NPM", "value" : 100, "unit" : "kUSD" },
    { "type" : "GPM", "value" : 50, "unit" : "CAD" }
  ]
}
Querying for distinct units is easy now, whether you want it per type or over all types:
Per type (all types in one query)
db.collection.aggregate([
  { "$unwind" : "$KPI" },
  { "$group" : { "_id" : "$KPI.type", "units" : { "$addToSet" : "$KPI.unit" } } }
])
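With just the sample document, this should return per-type buckets along these lines (result order is not guaranteed):
{ "_id" : "NPV", "units" : [ "kUSD" ] }
{ "_id" : "NPM", "units" : [ "kUSD" ] }
{ "_id" : "GPM", "units" : [ "CAD" ] }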
Over all types
db.collection.distinct("KPI.unit")
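which, for the sample document, would simply return:
[ "kUSD", "CAD" ]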

Can I do a MongoDB "starts with" query on an indexed subdocument field?

I'm trying to find documents where a field starts with a value.
Table scans are disabled using notablescan.
This works:
db.articles.find({"url" : { $regex : /^http/ }})
This doesn't:
db.articles.find({"source.homeUrl" : { $regex : /^http/ }})
I get the error:
error: { "$err" : "table scans not allowed:moreover.articles", "code" : 10111 }
There are indexes on both url and source.homeUrl:
{
  "v" : 1,
  "key" : {
    "url" : 1
  },
  "ns" : "mydb.articles",
  "name" : "url_1"
}
{
  "v" : 1,
  "key" : {
    "source.homeUrl" : 1
  },
  "ns" : "mydb.articles",
  "name" : "source.homeUrl_1",
  "background" : true
}
Are there any limitations with regex queries on subdocument indexes?
When you disable table scans, any query where a table scan "wins" in the query optimizer will fail to run. You haven't posted an explain, but based on the error it's reasonable to assume that's what is happening here. Try hinting the index explicitly:
db.articles.find({"source.homeUrl" : { $regex : /^http/ }}).hint({"source.homeUrl" : 1})
That should eliminate the table scan as a possible choice and allow the query to return successfully.
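To confirm which index is chosen, the plan can be inspected with explain(); a sketch reusing the names from the question:
db.articles.find({"source.homeUrl" : { $regex : /^http/ }})
           .hint({"source.homeUrl" : 1})
           .explain()
On older MongoDB versions (where notablescan raises error 10111), the explain output should show a BtreeCursor on source.homeUrl_1 rather than a BasicCursor.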