Extract a value from a Google Maps JSON response - regex

My JSON response from the Google Maps API is retrieved with:
%{ body: body} = HTTPoison.get! url
body = {
"geocoded_waypoints" : [{ ... },{ ... }],
"routes" : [{
"bounds" : { ...},
"copyrights" : "Map data ©2018 Google",
"legs" : [
{
"distance" : {
"text" : "189 km",
"value" : 188507
},
"duration" : {
"text" : "2 hours 14 mins",
"value" : 8044
},
"end_address" : "Juhan Liivi 2, 50409 Tartu, Estonia",
"end_location" : {
"lat" : 58.3785389,
"lng" : 26.7146963
},
"start_address" : "J. Sütiste tee 44, 13420 Tallinn, Estonia",
"start_location" : {
"lat" : 59.39577569999999,
"lng" : 24.6861104
},
"steps" : [
{ ... },
{ ... },
{ ... },
{ ... },
{
"distance" : {
"text" : "0.9 km",
"value" : 867
},
"duration" : {
"text" : "2 mins",
"value" : 104
},
"end_location" : {
"lat" : 59.4019886,
"lng" : 24.7108114
},
"html_instructions" : "XXXX",
"maneuver" : "turn-left",
"polyline" : {
"points" : "XXXX"
},
"start_location" : {
"lat" : 59.3943677,
"lng" : 24.708647
},
"travel_mode" : "DRIVING"
},
{ ... },
{ ... },
{ ... },
{ ... },
{ ... },
{ ... },
{ ... },
{ ... },
{ ... }
],
"traffic_speed_entry" : [],
"via_waypoint" : []
}
],
"overview_polyline" : { ... },
"summary" : "Tallinn–Tartu–Võru–Luhamaa/Route 2",
"warnings" : [],
"waypoint_order" : []
}
],
"status" : "OK"
}
In red (see the attached image) is what I'm currently getting with the Regex.named_captures call below:
%{"duration_text" => duration_text, "duration_value" => duration_value} = Regex.named_captures ~r/duration\D+(?<duration_text>\d+ mins)\D+(?<duration_value>\d+)/, body
In blue (also in the attached image) is what I want to extract from body; body is the JSON response my Google API URL returns in a browser.
Would you please assist and provide the regex?
Since http://www.elixre.uk/ is down, I can't find any API to help me do that.
Thanks in advance.

Don't use regexes on a JSON string. Instead, convert the JSON string to an Elixir map using Jason, Poison, etc., then use the keys in the map to look up the data you are interested in.
Here's an example:
json_map = Jason.decode!(get_json())
[first_route | _rest] = json_map["routes"]
[first_leg | _rest] = first_route["legs"]
distance = first_leg["distance"]
=> %{"text" => "189 km", "value" => 188507}
Similarly, you can get the other parts with:
duration = first_leg["duration"]
end_address = first_leg["end_address"]
...
...
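If the value you want is inside one particular step (the part in blue), keep indexing into the nested lists and maps. A minimal sketch, assuming it is the fifth step of the first leg that you highlighted:
step = Enum.at(first_leg["steps"], 4)  # first_leg is bound above; 4 picks the fifth step
step["duration"]["value"]
=> 104
Or in a single expression, using get_in/2 with Access.at/1:
get_in(json_map, ["routes", Access.at(0), "legs", Access.at(0), "steps", Access.at(4), "duration", "value"])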

Related

MongoDB Aggregate - Match condition only on second collection but provide all documents from the first

Basically, I have 2 collections in my MongoDB database -> Books, Scores.
Books
{
"BOOK_ID" : "100",
"BOOK_NAME" : "Book 1",
"BOOK_DESC" : "abcd",
},
{
"BOOK_ID" : "101",
"BOOK_NAME" : "Book 2",
"BOOK_DESC" : "efgh",
},
{
"BOOK_ID" : "102",
"BOOK_NAME" : "Book 3",
"BOOK_DESC" : "ijkl",
}
Scores
{
"BOOK_ID" : "100",
"BOOK_CATEGORY" : "kids",
"BOOK_SCORE" : "6",
},
{
"BOOK_ID" : "100",
"BOOK_CATEGORY" : "Educational",
"BOOK_SCORE" : "8",
},
{
"BOOK_ID" : "101",
"BOOK_CATEGORY" : "Kids",
"BOOK_SCORE" : "6",
},
{
"BOOK_ID" : "101",
"BOOK_CATEGORY" : "Fantasy",
"BOOK_SCORE" : "7",
}
Expected output, searching for all books with BOOK_CATEGORY = "Kids" and BOOK_SCORE = 6:
{
"BOOK_ID" : "100",
"BOOK_NAME" : "Book 1",
"BOOK_DESC" : "abcd",
"BOOK_CATEGORY" : "Kids",
"BOOK_SCORE" : 6
},
{
"BOOK_ID" : "101",
"BOOK_NAME" : "Book 2",
"BOOK_DESC" : "efgh",
"BOOK_CATEGORY" : "Kids",
"BOOK_SCORE" : 6
},
{
"BOOK_ID" : "102",
"BOOK_NAME" : "Book 3",
"BOOK_DESC" : "ijkl",
}
Notice that the category and score fields are appended to every book that has a matching score. If a book does not have any associated score, it still appears in the result.
What have I tried?
I have tried using $lookup:
pipeline = [
{
"$lookup": {
"from": "Scores",
"pipeline":[
{
"$match" : {
"BOOK_CATEGORY" : "Kids",
"BOOK_SCORE" : "6",
}
}
],
"localField": "BOOK_ID",
"foreignField": "BOOK_ID",
"as": "SCORES",
},
},
]
db.Books.aggregate(pipeline)
Also, after reading the $lookup subquery docs (https://www.mongodb.com/docs/manual/reference/operator/aggregation/lookup/#join-conditions-and-subqueries-on-a-joined-collection),
I got the feeling that what I am expecting may not be possible.
Can anyone help me write such a query? (I use PyMongo, by the way.)
Keep your $lookup stage and add two more stages at the end of the pipeline:
$replaceRoot - replaces each input document with the result of merging the current document with the first document from the SCORES array.
$unset - removes the SCORES array.
db.Books.aggregate([
{
"$lookup": {
"from": "Scores",
"pipeline": [
{
"$match": {
"BOOK_CATEGORY": "Kids",
"BOOK_SCORE": "6",
}
}
],
"localField": "BOOK_ID",
"foreignField": "BOOK_ID",
"as": "SCORES"
}
},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [
"$$ROOT",
{
$first: "$$ROOT.SCORES"
}
]
}
}
},
{
$unset: "SCORES"
}
])
Sample Mongo Playground
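Since you mentioned you are on PyMongo, the same pipeline can be passed to Collection.aggregate() as a list of dicts. A minimal sketch, assuming a local connection and a database named mydb (both are placeholders, adjust to your setup):
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
books = client["mydb"]["Books"]                    # placeholder database name

pipeline = [
    {"$lookup": {
        "from": "Scores",
        "pipeline": [{"$match": {"BOOK_CATEGORY": "Kids", "BOOK_SCORE": "6"}}],
        "localField": "BOOK_ID",
        "foreignField": "BOOK_ID",
        "as": "SCORES",
    }},
    # merge the first matching score (if any) into the book document
    {"$replaceRoot": {"newRoot": {"$mergeObjects": ["$$ROOT", {"$first": "$$ROOT.SCORES"}]}}},
    {"$unset": "SCORES"},
]

for doc in books.aggregate(pipeline):
    print(doc)
Note that the combined localField/foreignField + pipeline form of $lookup requires MongoDB 5.0 or later.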
You can also achieve this by using a conditional $addFields: if the $lookup result exists, populate the values; otherwise use $$REMOVE to remove the field, like so:
db.Books.aggregate([
{
"$lookup": {
"from": "Scores",
"pipeline": [
{
"$match": {
"BOOK_CATEGORY": "kids",
"BOOK_SCORE": "6"
}
}
],
"localField": "BOOK_ID",
"foreignField": "BOOK_ID",
"as": "SCORES"
}
},
{
$addFields: {
SCORES: "$$REMOVE",
"BOOK_SCORE": {
$cond: [
{
"$ifNull": [
{
"$arrayElemAt": [
"$SCORES",
0
]
},
false
]
},
{
$getField: {
field: "BOOK_SCORE",
input: {
"$arrayElemAt": [
"$SCORES",
0
]
}
}
},
"$$REMOVE"
]
},
"BOOK_CATEGORY": {
$cond: [
{
"$ifNull": [
{
"$arrayElemAt": [
"$SCORES",
0
]
},
false
]
},
{
$getField: {
field: "BOOK_CATEGORY",
input: {
"$arrayElemAt": [
"$SCORES",
0
]
}
}
},
"$$REMOVE"
]
},
}
}
])
Mongo Playground

GroupBy on a partition then count in Opensearch: Group By on multiple fields

I have the following data
{
"companyID" : "amz",
"companyType" : "ret",
"employeeID" : "ty-5a62fd78e8d20ad"
},
{
"companyID" : "amz",
"companyType" : "ret",
"employeeID" : "ay-5a62fd78e8d20ad"
},
{
"companyID" : "mic",
"companyType" : "cse",
"employeeID" : "by-5a62fd78e8d20ad"
},
{
"companyID" : "ggl",
"companyType" : "cse",
"employeeID" : "ply-5a62fd78e8d20ad"
},
{
"companyID" : "ggl",
"companyType" : "cse",
"employeeID" : "wfly-5a62ad"
}
I want the following result: basically, counts for each combination of values, such as mic-cse, ggl-cse, amz-ret.
"agg_by_company_type" : {
"buckets" : [
{
"key" : "ret",
"doc_count" : 1
},
{
"key" : "cse",
"doc_count" : 2
}
]
How do I do it?
I have tried the following aggregations:
"agg_by_companyID_topHits": {
"terms": {
"field": "companyID.keyword",
"size": 100000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": true,
"order": {
"_key": "asc"
}
},
"aggs": {
"agg_by_companyType" : {
"top_hits": {
"size": 1,
"_source": {
"includes": ["companyType"]
}
}
}
}
}
But this just gives me the first group-by on companyID; on top of that data, I want the count of each companyType.
This is the response I get:
"agg_by_companyID_topHits" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "amz",
"doc_count" : 2,
"doc_count_error_upper_bound" : 0,
"agg_by_companytype" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "my-index",
"_type" : "_doc",
"_id" : "uytuygjuhg",
"_score" : 0.0,
"_source" : {
"companyType" : "ret"
}
}
]
}
}
},
{
"key" : "mic",
"doc_count" : 1,
"doc_count_error_upper_bound" : 0,
"agg_by_companytype" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "my-index",
"_type" : "_doc",
"_id" : "uytuygjuhg",
"_score" : 0.0,
"_source" : {
"companyType" : "cse"
}
}
]
}
}
},
{
"key" : "ggl",
"doc_count" : 2,
"doc_count_error_upper_bound" : 0,
"agg_by_companytype" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "my-index",
"_type" : "_doc",
"_id" : "uytuygjuhg",
"_score" : 0.0,
"_source" : {
"companyType" : "ret"
}
}
]
}
}
},
]
}
If this were Spark, it would be simple to partition by companyID, group it, and then group by companyType and count to get the desired result, but I'm not sure how to do it in ES.
Important note: I am working with OpenSearch.
A possible solution for this in Elasticsearch is the multi-terms aggregation, but it is not available in versions before v7.12,
so I'm wondering how this was done before that feature existed in ES.
We came across this issue because AWS migrated from ES to OpenSearch.
Use the multi_terms aggregation (docs here):
GET /products/_search
{
"aggs": {
"genres_and_products": {
"multi_terms": {
"terms": [{
"field": "companyID"
}, {
"field": "companyType"
}]
}
}
}
}
You can also use a script in the terms aggregation, like this:
GET b1/_search
{
"aggs": {
"genres": {
"terms": {
"script": {
"source": "doc['companyID'].value+doc['companyType'].value",
"lang": "painless"
}
}
}
}
}
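For completeness: on clusters that do not have multi_terms yet, the usual way to group on several fields was a composite aggregation with one terms source per field, which was available long before 7.12 and also works on OpenSearch. A sketch, assuming both fields have keyword sub-fields (companyID.keyword as in your attempt; companyType.keyword is an assumption):
GET my-index/_search
{
  "size": 0,
  "aggs": {
    "company_and_type": {
      "composite": {
        "size": 100,
        "sources": [
          { "companyID": { "terms": { "field": "companyID.keyword" } } },
          { "companyType": { "terms": { "field": "companyType.keyword" } } }
        ]
      }
    }
  }
}
Each returned bucket has a key containing both values (e.g. {"companyID": "ggl", "companyType": "cse"}) plus a doc_count, and you page through the buckets with the after parameter.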

Is there a JSON filter option available in MongoDB?

In the JSON document below, I want to filter on the decimal value in the child array of text. When I execute the sample code, the output shows the whole document instead of the filtered value.
Sample JSON document:
{
    "_id" : ObjectId("01"),
    "project_id" : "100",
    "snapshot" : "Symbol",
    "obs" : {
        "land" : {
            "b2" : {
                "points" : [
                    "p1",
                    "p2"
                ]
            }
        },
        "points" : {
            "p1" : {
                "position" : [
                    123.00000000,
                    123.00000000
                ]
            },
            "p2" : {
                "position" : [
                    1235.896523
                ]
            }
        },
        "text" : {
            "t2" : {
                "box" : [
                    [
                        123.0,
                        3361.0
                    ],
                    [
                        117,
                        60
                    ],
                    0.0
                ],
                "value" : "813"
            },
            "t3" : {
                "box" : [
                    [
                        1260.76745605469,
                        726.63720703125
                    ],
                    [
                        51.4486427307129,
                        88.5970306396484
                    ],
                    -36.2538375854492
                ],
                "value" : "27.06"
            }
        }
    }
}
Sample code:
db.getCollection('obs').aggregate([
    { $match: { project_id: "100", snapshot: "Symbol" } },
    { $addFields: {
        "obs.text": {
            $arrayToObject: {
                $filter: {
                    input: { $objectToArray: "$obs.text" },
                    cond: {
                        $regexMatch: {
                            input: "$$this.k",
                            regex: /t/
                        }
                    }
                }
            }
        }
    }}
])
Expected output:
{
    "_id" : ObjectId("01"),
    "project_id" : "100",
    "snapshot" : "Symbol",
    "obs" : {
        "text" : {
            "t3" : {
                "value" : "27.06"
            }
        }
    }
}

Elastic Search only matches full field

I have just started using Elasticsearch 6 on AWS.
I have inserted data into my ES endpoint, but I can only search it using the full sentence and cannot match individual words. In the past I would have used not_analyzed, but that has been replaced by 'keyword'. However, this still doesn't work.
Here is my index:
{
"seven" : {
"aliases" : { },
"mappings" : {
"myobjects" : {
"properties" : {
"id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"myId" : {
"type" : "text"
},
"myUrl" : {
"type" : "text"
},
"myName" : {
"type" : "keyword"
},
"myText" : {
"type" : "keyword"
}
}
}
},
"settings" : {
"index" : {
"number_of_shards" : "5",
"provided_name" : "seven",
"creation_date" : "1519389595593",
"analysis" : {
"filter" : {
"nGram_filter" : {
"token_chars" : [
"letter",
"digit",
"punctuation",
"symbol"
],
"min_gram" : "2",
"type" : "nGram",
"max_gram" : "20"
}
},
"analyzer" : {
"nGram_analyzer" : {
"filter" : [
"lowercase",
"asciifolding",
"nGram_filter"
],
"type" : "custom",
"tokenizer" : "whitespace"
},
"whitespace_analyzer" : {
"filter" : [
"lowercase",
"asciifolding"
],
"type" : "custom",
"tokenizer" : "whitespace"
}
}
},
"number_of_replicas" : "1",
"uuid" : "_vNXSADUTUaspBUu6zdh-g",
"version" : {
"created" : "6000199"
}
}
}
}
}
I have data like this:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 13,
"max_score" : 1.0,
"hits" : [
{
"_index" : "seven",
"_type" : "myobjects",
"_id" : "8",
"_score" : 1.0,
"_source" : {
"myUrl" : "https://myobjects.com/wales.gif",
"myText" : "Objects for Welsh Things",
"myName" : "Wales"
}
},
{
"_index" : "seven",
"_type" : "myobjects",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"myUrl" : "https://myobjects.com/flowers.gif",
"myText" : "Objects for Flowery Things",
"myNoun" : "Flowers"
}
}
]
}
}
If I then search for 'Objects' I get nothing. If I search for 'Objects for Flowery Things' I get the single result.
I am using this to search for items:
POST /seven/objects/_search?pretty
{
"query": {
"multi_match" : { "query" : q, "fields": ["myText", "myNoun"], "fuzziness":"AUTO" }
}
}
Can anybody tell me how to have the search match any word in the sentence rather than having to put the whole sentence in the query?
This is because your myName and myText fields are of keyword type:
...
"myName" : {
"type" : "keyword"
},
"myText" : {
"type" : "keyword"
}
...
and because of this they are not analyzed, so only an exact match on the full value will work for them. Change the type to text and it should work as you expect (note that you cannot change the type of an existing field in place; you need to create a new index with the updated mapping and reindex your data):
...
"myName" : {
"type" : "text"
},
"myText" : {
"type" : "text"
}
...
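Once myText and myName are mapped as text and the documents have been reindexed, a single-word match should return the hits. A quick sketch against the same index (lowercase or original case both work, since the standard analyzer lowercases at index and search time):
POST /seven/_search?pretty
{
  "query": {
    "match" : { "myText" : "objects" }
  }
}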

Search any part of a word in any column

I'm trying to search by full_name, email, or phone.
For example, if I start typing "+16", it should display all users whose phone number starts with or contains "+16". The same goes for full name and email.
My ES config is:
{
"users" : {
"mappings" : {
"user" : {
"properties" : {
"full_name" : {
"analyzer" : "trigrams",
"include_in_all" : true,
"type" : "string"
},
"phone" : {
"type" : "string",
"analyzer" : "trigrams",
"include_in_all" : true
},
"email" : {
"analyzer" : "trigrams",
"include_in_all" : true,
"type" : "string"
}
},
"dynamic" : "false"
}
},
"settings" : {
"index" : {
"creation_date" : "1472720529392",
"number_of_shards" : "5",
"version" : {
"created" : "2030599"
},
"uuid" : "p9nOhiJ3TLafe6WzwXC5Tg",
"analysis" : {
"analyzer" : {
"trigrams" : {
"filter" : [
"lowercase"
],
"type" : "custom",
"tokenizer" : "my_ngram_tokenizer"
}
},
"tokenizer" : {
"my_ngram_tokenizer" : {
"type" : "nGram",
"max_gram" : "12",
"min_gram" : "2"
}
}
},
"number_of_replicas" : "1"
}
},
"aliases" : {},
"warmers" : {}
}
}
Searching for the name 'Robert' by a part of the name:
curl -XGET 'localhost:9200/users/_search?pretty' -d'
{
"query": {
"match": {
"_all": "rob"
}
}
}'
doesn't give the expected result; it only works when using the full name.
Since your analyzer is set on the full_name, phone, and email fields, you should not use the _all field but instead enumerate those fields in a multi_match query, like this:
curl -XGET 'localhost:9200/users/_search?pretty' -d'{
"query": {
"multi_match": {
"query": "this is a test",
"fields": [
"full_name",
"phone",
"email"
]
}
}
}'