Elasticsearch Query String in Rails 4

I am using Elasticsearch in my application to search for a matching word anywhere in a table.
This is the query string I have used to fetch my results:
search({ query: { prefix: { _all: keywords }}, sort: [ { start_date: 'asc', start_time: 'asc' } ] })
The selected records were then being queried with the dates to match the date range(s) specified in the application, by the following query:
where("status_id= ? and active=? and (((start_date >= ?) and (start_date <= ?))
or ((start_date <= ?) and (? <= end_date)))",2,true,range_start_date,
range_end_date,range_start_date,range_start_date)
But I know this is not a good way to fetch results. Now I want to modify this to fetch just the required data from the Elasticsearch index.
After a long search I found "query_string" and "simple_query_string", which seem to match my requirement, but so far I have been unsuccessful in getting the required result.
How can I combine the date conditions with the Elasticsearch query to get the required records?
Can someone please help?
Thanks in advance.

Finally, I was able to find the answer to the question myself. I was able to filter the searched content by date using the "filter" keyword.
I modified the query as:
search_query = {
  query: {
    prefix: {
      _all: keywords
    }
  },
  filter: {
    query: {
      query_string: {
        query: "status_id:2 AND active:true AND
                ((start_date:>=#{range_start_date} AND start_date:<=#{range_end_date}) OR
                 (start_date:<=#{range_start_date} AND end_date:>=#{range_start_date}))"
      }
    }
  },
  sort: [ { start_date: 'asc', start_time: 'asc' } ]
}
And finally fetched the result by:
@result = self.search(search_query)
If there is any way I could improve this code, please suggest. Thank you.
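One possible refinement, offered as a sketch rather than a drop-in replacement: instead of interpolating values into a query_string (which is fragile if dates or keywords contain reserved characters), the same logic can be expressed with a bool filter built from term and range clauses. The field names below mirror the ones used above and are assumptions about your mapping:

search_query = {
  query: { prefix: { _all: keywords } },
  filter: {
    bool: {
      must: [
        { term: { status_id: 2 } },    # status_id = 2
        { term: { active: true } },    # active = true
        { bool: {                      # at least one date condition must match
            should: [
              { range: { start_date: { gte: range_start_date, lte: range_end_date } } },
              { bool: { must: [
                  { range: { start_date: { lte: range_start_date } } },
                  { range: { end_date:   { gte: range_start_date } } }
              ] } }
            ]
        } }
      ]
    }
  },
  sort: [ { start_date: 'asc', start_time: 'asc' } ]
}
@result = self.search(search_query)

Filters of this shape avoid the query-string parser entirely and can be cached by Elasticsearch, which is usually the safer choice for structured conditions.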

Related

Not able to get desired search results in ElasticSearch search api

I have field "xyz" on which i want to search. The type of the field is keyword. The different values of the field "xyz "are -
a/b/c/d
a/b/c/e
a/b/f/g
a/b/f/h
Now, for the following query:
{
  "query": {
    "query_string" : {
      "query" : "(xyz:(\"a/b/c\"*))"
    }
  }
}
I should only get these two results:
a/b/c/d
a/b/c/e
but I get all four results:
a/b/c/d
a/b/c/e
a/b/f/g
a/b/f/h
Edit:
Actually, I am not querying Elasticsearch directly; I am using this API https://atlas.apache.org/api/v2/resource_DiscoveryREST.html#resource_DiscoveryREST_searchWithParameters_POST which creates the above-mentioned query for Elasticsearch, so I don't have much control over the Elasticsearch query_string. What I can change is the Elasticsearch analyzer for this field, or its type.
You'll need to let the query_string parser know you'll be using a regex, so wrap the whole thing in /.../ and escape the forward slashes:
{
  "query": {
    "query_string": {
      "query": "xyz:/(a\\/b\\/c\\/.*)/"
    }
  }
}
Or, you might as well use a regexp query:
{
  "query": {
    "regexp": {
      "xyz": "a/b/c/.*"
    }
  }
}
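If you did have direct control over the query (which, per the edit above, the Atlas API may not allow), a wildcard query is a third option that works naturally on a keyword field. This is a sketch, not something the Atlas endpoint will generate for you:

{
  "query": {
    "wildcard": {
      "xyz": "a/b/c/*"
    }
  }
}

Like the regexp form, this matches against the exact untokenized value, so only a/b/c/d and a/b/c/e would be returned.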

appsync dynamodb won't return primary partition key

Thanks in advance, as this is probably a 101 question; I can't find an answer anywhere.
I've set up what I think is a simple example of AppSync and DynamoDB.
In DynamoDB I have a categorys table, with items of the form
{
  slug: String!,
  nm: String,
  nmTrail: String,
  ...
}
So: no id field. slug is the primary partition key, not null, and expected to be unique (it is unique in the data I've got loaded so far).
I've set up a simplified AppSync schema in line with the above definition and
a resolver...
{
  "version": "2017-02-28",
  "operation" : "GetItem",
  "key" : {
    "slug" : { "S" : "${context.arguments.slug}" }
  }
}
A query such as
query GoGetOne {
  getCategory(slug: "Wine") {
    nm
  }
}
works fine, returning the nm value for the correct item in categorys. Similarly, I can add any of the other properties in categorys to return them (e.g. nmTrail), except slug.
If I add slug (the primary partition key, a non-nullable String) to the result set, then I get a DynamoDB:AmazonDynamoDBException of "the provided key element does not match the schema" (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException).
If I scan/query/filter the table in DynamoDB all is good.
Most of the AWS 'get one' examples use an id: ID! field and also ask for it as a returned item.
Update 1, in response to KD's request:
My update mutation schema is:
type Mutation {
  putCategory(
    slug: String!,
    nm: String,
    nmTrail: String,
    mainCategorySlug: String,
    mainCategoryNm: String,
    parentCategorySlug: String,
    parentCategoryNm: String
  ): Category
}
There is no resolver associated with that, and therefore (obviously) I haven't used the mutation to put anything yet; I'm just trying to read batch-uploaded data to begin with.
/update1
What am I missing?
I tried to reproduce your API as much as I could and it works for me.
Category DynamoDB table
Schema:
type Query {
  getCategory(slug: String!): Category
}

type Category {
  slug: String
  nm: String
  nmTrail: String
}
Resolver on Query.getCategory request template:
{
  "version": "2017-02-28",
  "operation": "GetItem",
  "key": {
    "slug": $util.dynamodb.toDynamoDBJson($ctx.args.slug)
  }
}
Resolver on Query.getCategory response template:
$util.toJson($ctx.result)
Query:
query GoGetOne {
  getCategory(slug: "Wine") {
    slug
    nm
  }
}
Results:
{
  "data": {
    "getCategory": {
      "slug": "Wine",
      "nm": "Wine1-nm"
    }
  }
}
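Since this reproduction works, the ValidationException in your setup most likely means the resolver's key name does not exactly match the table's actual hash key. One way to confirm is to inspect the key schema; here is a sketch using the AWS CLI, with the table name assumed from your question:

aws dynamodb describe-table --table-name categorys --query "Table.KeySchema"

If the KeySchema reports an attribute other than slug (for example an id left over from an earlier wizard run), rename the key in the GetItem resolver to match it.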

Map different Sort Key responses to Appsync Schema values

So here is my schema:
type Model {
  PartitionKey: ID!
  Name: String
  Version: Int
  FBX: String
  # ms since epoch
  CreatedAt: AWSTimestamp
  Description: String
  Tags: [String]
}

type Query {
  getAllModels(count: Int, nextToken: String): PaginatedModels!
}

type PaginatedModels {
  models: [Model!]!
  nextToken: String
}
I would like to call 'getAllModels' and have all of its data, and all of its tags, filled in.
But here is the thing: tags are stored via sort keys, like so:
PartitionKey | SortKey
Model-0      | Model-0
Model-0      | Tag-Tree
Model-0      | Tag-Building
Is it possible to transform the 'Tag' sort keys into the Tags: [String] array in the schema via a DynamoDB resolver? Or must I do something extra fancy through a lambda? Or is there a smarter way to do this?
To clarify, are you storing objects like this in DynamoDB:
{ PartitionKey (HASH), Tag (SortKey), Name, Version, FBX, CreatedAt, Description }
and using a DynamoDB Query operation to fetch all rows for a given hash key:
Query #PartitionKey = :PartitionKey
and getting back a list of objects, some of which have a different "Tag" value and one of which is "Model-0" (i.e. the same value as the partition key)? I assume that record contains all the other values for the record, e.g.:
[
  { PartitionKey, Tag: 'ValueOfPartitionKey', Name, Version, FBX, CreatedAt, ... },
  { PartitionKey, Tag: 'Tag-Tree' },
  { PartitionKey, Tag: 'Tag-Building' }
]
You can definitely, without too much hassle, write resolver logic that reduces the list of model objects into a single object with a list of "Tags". Let's start with a single item and see how to implement a getModel(id: ID!): Model query.
First, define the request mapping template that will get all rows for a partition key:
{
  "version" : "2017-02-28",
  "operation" : "Query",
  "query" : {
    "expression": "#PartitionKey = :id",
    "expressionValues" : {
      ":id" : {
        "S" : "${ctx.args.id}"
      }
    },
    "expressionNames": {
      "#PartitionKey": "PartitionKey" ## whatever the table hash key is
    }
  },
  ## The limit will have to be sufficiently large to get all rows for a key
  "limit": $util.defaultIfNull(${ctx.args.limit}, 100)
}
Then to return a single model object that reduces "Tag" to "Tags" you can use this response mapping template:
#set($tags = [])
#set($result = {})
#foreach( $item in $ctx.result.items )
  #if($item.PartitionKey == $item.Tag)
    #set($result = $item)
  #else
    $util.qr($tags.add($item.Tag))
  #end
#end
$util.qr($result.put("Tags", $tags))
$util.toJson($result)
This will return a response like this:
{
  "PartitionKey": "...",
  "Name": "...",
  "Tags": ["Tag-Tree", "Tag-Building"]
}
Fundamentally I see no problem with this, but its effectiveness depends on your query patterns. Extending this to the getAll use case is doable but will require a few changes, and most likely a really inefficient Scan operation, because the table will be sparse of actual information since many records are effectively just tags. You can alleviate this with GSIs pretty easily, but more GSIs means more $.
As an alternative approach, you can store your tags in a separate "Tags" table. This way you only store model information in the Model table and tag information in the Tag table, and you leverage GraphQL to perform the join for you. In this approach, have Query.getAllModels perform a Scan (or Query) on the Model table, and then have a Model.Tags resolver that performs a Query against the Tag table (HK: ModelPartitionKey, SK: Tag), as sketched below. You could then get all tags for a model, and later create a GSI to get all models for a tag. You do need to consider that the nested Model.Tags query will now get called once per model, but Query operations are fast and I've seen this work well in practice.
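For concreteness, here is a hedged sketch of that Model.Tags resolver; the Tag table would be its data source, and the attribute names (ModelPartitionKey, Tag) are assumptions carried over from the description above. Request mapping template:

{
  "version": "2017-02-28",
  "operation": "Query",
  "query": {
    "expression": "#mk = :modelKey",
    "expressionNames": { "#mk": "ModelPartitionKey" },
    "expressionValues": {
      ":modelKey": $util.dynamodb.toDynamoDBJson($ctx.source.PartitionKey)
    }
  }
}

Response mapping template, flattening the rows into the [String] shape the schema expects:

#set($tags = [])
#foreach($item in $ctx.result.items)
  $util.qr($tags.add($item.Tag))
#end
$util.toJson($tags)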
Hope this helps :)

How to format date in Logstash Configuration

I am using Logstash to parse log entries from an input log file.
Log line:
TID: [0] [] [2016-05-30 23:02:02,602] INFO {org.wso2.carbon.registry.core.jdbc.EmbeddedRegistryService} - Configured Registry in 572ms {org.wso2.carbon.registry.core.jdbc.EmbeddedRegistryService}
Grok Pattern:
TID:%{SPACE}\[%{INT:SourceSystemId}\]%{SPACE}\[%{DATA:ProcessName}\]%{SPACE}\[%{TIMESTAMP_ISO8601:TimeStamp}\]%{SPACE}%{LOGLEVEL:MessageType}%{SPACE}{%{JAVACLASS:MessageTitle}}%{SPACE}-%{SPACE}%{GREEDYDATA:Message}
My grok pattern is working fine. I am sending these parsed entries to a REST-based API that I made myself.
Configuration:
output {
  stdout { }
  http {
    url => "http://localhost:8086/messages"
    http_method => "post"
    format => "json"
    mapping => ["TimeStamp","%{TimeStamp}","CorrelationId","986565","Severity","NORMAL","MessageType","%{MessageType}","MessageTitle","%{MessageTitle}","Message","%{Message}"]
  }
}
In the current output, I am getting the date as it is parsed from the logs:
Current Output:
{
  "TimeStamp": "2016-05-30 23:02:02,602"
}
Problem Statement:
The problem is that my API does not expect the date in this format; it expects the generic XSD dateTime format, as shown below:
Expected Output:
{
  "TimeStamp": "2016-05-30T23:02:02.602"
}
Can somebody please guide me on what changes I need to make in my filter or output mapping to achieve this?
In order to transform
2016-05-30 23:02:02,602
to the XSD datetime format
2016-05-30T23:02:02.602
you can simply add a mutate/gsub filter that replaces the space character with a T and the , with a .:
filter {
  mutate {
    gsub => [
      "TimeStamp", "\s", "T",
      "TimeStamp", ",", "."
    ]
  }
}
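As an alternative sketch (untested against your pipeline): Logstash's date filter can parse the original string and write a real timestamp back into the same field. Note that it normalizes to UTC and serializes with a trailing Z, which may or may not suit your API:

filter {
  date {
    # Parse the grok-captured string, e.g. "2016-05-30 23:02:02,602"
    match => ["TimeStamp", "yyyy-MM-dd HH:mm:ss,SSS"]
    # Write the parsed ISO8601 value back into TimeStamp
    target => "TimeStamp"
  }
}

The gsub approach above keeps the original local-time value verbatim, so it is the simpler fix if your API just needs the lexical format changed.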

Filter spatial view response in couchbase

Below is the response from my spatial view query in Couchbase by providing bounding box parameters:
{
  "rows": [
    {
      "geometry": {
        "type": "Point",
        "coordinates": [
          -71.10364,
          42.381411
        ]
      },
      "value": {
        "location": {
          "type": "Point",
          "coordinates": [
            -71.10364,
            42.381411
          ]
        },
        "name": "test",
        "visibility": "public"
      },
      "id": "test",
      "key": [
        [
          -71.10364,
          -71.10364
        ],
        [
          42.381411,
          42.381411
        ]
      ]
    }
  ]
}
and here is my spatial view function:
function (doc, meta) {
  if (doc.type == "folder" && doc.location && doc.location.type) {
    if (doc.location.type == 'Point') {
      var visibility = doc.enabled === true ? 'public' : 'private';
      emit(doc.location, {
        name: doc.name,
        folder_id: doc.folder_id,
        location: doc.location,
        visibility: visibility
      });
    }
  }
}
But the JSON response contains unwanted data, so I am wondering how I can remove the geometry and key parameters from the JSON response.
Also, the query returns only the first 10 records; is there any way I can set limit and skip parameters so the query returns all the data instead of just the first 10?
To answer the second half of your question (please post two separate questions next time): yes, views support pagination. You can set the number of results; you can ask for x results per page and for different pages, as sketched below.
See this: http://blog.couchbase.com/pagination-couchbase
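As a quick sketch (the host, bucket, and design document names here are placeholders, and the port assumes the default Couchbase view port), limit and skip are plain query-string parameters on the spatial view endpoint:

GET http://localhost:8092/my_bucket/_design/dev_places/_spatial/points?bbox=-71.2,42.3,-71.0,42.5&limit=20&skip=40

So limit=20&skip=40 would return the third page of 20 rows; to fetch everything, either raise limit or keep paging until fewer than limit rows come back.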
And also: dev views only work on part of your bucket. Publish them to get results that correspond to the entire data set.
You can't remove the geometry and key; both are part of the result. If you don't want to use them, then simply don't do anything with them.